Erythroblastosis And Perceptive Hearing Loss:
Responses Of Athetoids To Tests Of Cochlear Function


ROBERT W. BLAKELEY 


Erythroblastosis fetalis has been defined (14) as
a condition which becomes manifest late in 
fetal life or soon after birth, with exces- 
sive destruction of red blood cells and 
extensive compensatory overdevelopment 
of erythropoietic tissue. It may occur as a 
result of transplacental passage of an anti- 
Rh agglutinin produced in an Rh-negative 
mother who has been immunized by the 
Rh-positive red cells of the fetus or by a 
transfusion of Rh-positive blood. 


This process of ‘iso-immunization’ has 
now been recognized as resulting from 
several erythrocytic antigens of which 
the Rh factor (or D factor) is only 
one. 

‘Kernicterus,’ sometimes erroneously 
used synonymously with ‘erythroblas- 
tosis fetalis,’ is a pathological term re- 
ferring to the yellow staining often 
found in the brain of erythroblastotic 
patients at autopsy and is now known 
also to exist in association with clinical 
entities other than erythroblastosis 
(46), usually with manifest central 
nervous system disorder. The late clin- 
ical signs of kernicterus following 
erythroblastosis fetalis, according to 
Perlstein (38), are so characteristic that 
the etiology may be suspected from the
examination of the patient. The out- 
standing sequela of kernicterus is ath- 
etosis (21, 38), ‘presumably due to in- 
volvement of the basal nuclei and/or 
their cortical connections’ (38). 
Deafness also is recognized as a com- 
mon sequela of kernicterus caused by 
erythroblastosis fetalis. The literature 
contains a number of contradictory 
suppositions and conclusions regarding 
the nature of this hearing loss and the 
site of the auditory lesion. Hearing in- 
volvement is reported to occur in from 
4% to more than 50% of such cases. 
A summary of 168 erythroblastotic 
cases reported throughout the litera- 
ture shows that 17% had perceptive 
hearing loss, while a similar survey of 
the literature involving 941 congenitally 
deaf cases indicates that 3% had histories
of erythroblastosis. ‘Typical’ audiograms
reported by Crabtree and
Gerrard (8) and others (25, 38) show
characteristic high-frequency perceptive-type
deafness. The loss is also described
as bilaterally symmetrical (21).


Generally it is presumed that deaf- 
ness from kernicterus occurring with 
erythroblastosis is due to pigment dep- 
osition and loss of cell population in 
the cochlear nuclei on a bilateral basis. 
This presumption is based on pathologi- 
cal studies by Bertrand (3), Potter 
(39) and Dublin (14). The latter
writes, 


lesions of the important nuclear stations 
of the perceptive auditory pathways are 
shown which serve as the basis for the 
deafness that may follow erythroblastosis 
fetalis, and the relation of the injured 
nuclear centers to bilateral representation 
of hearing is in keeping with the bilaterally 
similar pattern of loss of hearing. 


However, the results of testing of pa- 
tients by psychogalvanic methods 
caused Byers, Paine and Crothers (4) 
to conclude that the auditory lesion is 
at a higher, cortical level. Support for 
this belief also comes from those who 
suggest that the ‘Rh deaf’ child may be 
‘aphasoid’ rather than lacking in hear- 
ing acuity. Rosen (40) points out the 
fluctuating nature of hearing in these 
cases. He studied 33 erythroblastotics 
who had been diagnosed as deaf and 
concluded that 11 had normal hearing 
and seven had less than 25 db losses. A 
‘preponderance of aphasic-like language 
disturbances in the athetoid group, 
seemingly as a result of kernicterus’ 
has been described by Cohen (6). 
Myklebust (37) states that ‘one cannot 
avoid the question of whether the Rh 
child’s problem might be . . . a mix- 
ture of deafness and aphasia.’ This same 
line of reasoning is followed by Good- 
hill (22) when he points out that the 
neurologic lesions are inconsistent and 
widespread and thus may involve any 
portion of the auditory pathways, re- 
sulting in various combinations of cen- 
tral auditory injury. 

Most writers have hypothesized that 
inasmuch as athetosis results from cen- 
tral damage, it should follow that 
when perceptive deafness occurs in 
these cases the lesion causing the deafness
also must be central. Dublin (14)
tends to disagree with this hypothesis
when he points out that 


the lesions producing athetoid cerebral 
palsy, whether located in the globus palli- 
dus, substantia nigra, or premotor area of 
the cortex, might well be caused by anoxia 
also, but not likely by any lesion common 
to the extrapyramidal motor pathways and 
the auditory paths, since the two are 
everywhere widely separated. 


For the most part, the supposition 
that the cochlea might be damaged as 
a result of erythroblastosis fetalis has 
not been pursued. Pathologic data with 
reference to the cochlea in this type 
of deafness are limited and can be 
briefly summarized. Gerrard (19) ex- 
amined serial sections of the cochlea 
in two cases of kernicterus who died 
during the neonatal period. These cases 
showed extensive destruction of the 
nerve cells in the cochlear nuclei but 
no abnormality was found either in 
the organ of Corti or in the spiral 
ganglion of Corti. Unpublished tem- 
poral-bone studies of Wolff and Good- 
hill are mentioned by Goodhill (22) 
in which no end-organ involvement 
was discovered but cellular destruction 
in the spiral cochlear ganglion was 
demonstrated. Marullo (34) studied the 
otic capsules by x ray in twins of 
‘maternal isoimmunization’ and noted 
a retardation in the development of 
the periosteal layer in each case. Au- 
topsy was performed by Kelemen 
(28) on a three-day-old infant who 
died of erythroblastosis fetalis. No kern- 
icterus was discovered in the brain 
(the child was jaundiced). The tem- 
poral bones were studied microscop- 
ically and revealed cochlear distortion 
which presumably would impair au- 
ditory perception. It should be noted 
that in none of these pathologic studies 
was the child known to have had a 
hearing loss. 


Audiologic studies offer presumptive
evidence of cochlear injury in
these cases. Cavanagh (5) reported
that one of her cases had a unilateral 
perceptive hearing loss. Rosen (40) 
found that one of his subjects reacted 
to testing in a way that ‘strongly sug- 
gested the presence of recruitment.’ 
Bentzen¹ stated that recruitment often
is found in Hearing Station cases. A 
study of hearing impairment in athe- 
toids by Flottorp, Morley and Skatvedt 
(17) gives strong evidence of cochlear 
injury. All 10 of their cases (four, and 
possibly a fifth, were known to have 
been kernicteric from erythroblastosis) 
showed evidence of recruitment and 
had significantly reduced dynamic 
ranges² compared to those found in unimpaired
ears. This latter result also
is considered as an indication of coch- 
lear involvement. 


In summary, the literature reviewed 
by this writer gives no uniform in- 
formation with regard to the location, 
or locations, of the auditory injury in 
cases of perceptive hearing loss follow- 
ing erythroblastosis fetalis. 


‘Deaf’ or ‘Rh’ athetoids are en- 
countered regularly by workers in the 
fields of speech pathology and au- 
diology. As a group they have pre- 
sented many challenging problems of 
evaluation and education. It was largely 
because of interest in the problems 
presented by these children that the 
writer undertook this study. 


The question asked in this investigation
was the following: Do tests of
cochlear function, namely, tests of recruitment
and dynamic range of hearing,
reveal any abnormal cochlear function
in subjects with a perceptive type
of hearing loss who have a history of
erythroblastosis fetalis?

¹From a personal communication to the
author in 1957 from O. Bentzen, Chief
Physician, Statens Horecentral (State Central
Hearing Station), Aarhus, Denmark.

²Range in db from sensitivity threshold to
aural-harmonic threshold.


Clinical Tests. The phenomenon 
known as recruitment, according to 
Hirsh (27), is evidenced ‘when it can 
be shown that the loudness of a given 
tone increases more rapidly than nor- 
mal as the sensation level of the tone is 
increased in equal decibel steps.’ Thus, 
in spite of a deficiency at threshold, 
the loudness of a tone in an ear with 
recruitment seems to catch up with 
the normal ear. Through the work of 
Dix and Hood (11), Eby and Williams 
(15), and others, it has come to be ac- 
cepted that recruitment is a phenome- 
non indicative of cochlear lesion (45). 

The aural-harmonic test³ also is sensitive
to cochlear injury (33, 36). Aural
harmonics are thought to arise from 
the sensory cells of the cochlea (44) 
when these cells are overloaded by a 
sound whose intensity is greater than 
that which the physical structure of 
the sensory cells can linearly repro- 
duce. The intensity at which distortion 
takes place is called the aural-harmonic 
threshold. The range between the 
threshold of hearing sensitivity and the 
aural-harmonic threshold is signifi- 
cantly reduced by cochlear injury 
(33, 36). 


³The term ‘aural overload’ is synonymous
with ‘aural harmonic,’ but the latter term is
favored now for the sake of consistent
terminology.


Procedure


The experimental subjects were
20 hearing-defective erythroblastotics,
ranging from seven to 23 years of age
(mean age 12.6 years). Selection of
subjects was based on pediatric and
otologic examination, blood study and 
demonstrable perceptive hearing loss. 
Each subject was diagnosed by a pedia- 
trician as having been erythroblastotic, 
this diagnosis being based on case his- 
tory, medical records, Rh blood study 
of mother and subject (including the 
Indirect-Coombs Test (7)), and on a
neurological examination. Although 
athetosis was not used as a criterion for 
selection of subjects, each of the sub- 
jects had this type of cerebral palsy 
as his major neurological involvement. 
An otologist examined the subjects to 
verify that a perceptive hearing loss 
did exist and to determine whether 
there was any hearing loss due to a 
conductive involvement. 


Air- and bone-conduction thresholds 
were obtained on all of the subjects 
by standard audiometric procedures at 
250, 500, 1000, 2000, 4000 and 8000 
cps. Each subject was then given a 
recruitment test using either the bi- 
naural or the monaural loudness-balance 
method, depending upon which method 
was most easily applied. A threshold 
difference of at least 20 db between 
alternate ears at the same frequency 
was required for the binaural method, 
while a difference of 25 db or greater 
was required for comparisons of dif- 
ferent frequencies in the same ear 
using the monaural method. These 
were virtually the same criteria 
adopted by Flottorp, Morley and Skat- 
vedt (17), who used 15 and 30 db, re- 
spectively, as minimal requirements. 
Correction norms were established from 
equal-loudness judgments made at 250, 
500, 2000 and 4000 cps by 12 normal subjects,
using 1000 cps as a reference
level. Judgments by the subjects were 
made at 40, 70 and 90 db. These correction
norms were used in connection
with the monaural loudness-balance test
in which the subject is asked to balance
the loudness of two tones of different
frequency.

Figure 1. In the audiograms shown above,
the solid line represents the right ear and the
broken line represents the left. These audiograms
pertain to the following (from top to
bottom): A: the experimental subject with
the most residual hearing; B: the statistical
mean audiogram of all 20 experimental subjects;
and C: the experimental subject with
the least amount of residual hearing.

Fletcher and Munson (16)
have described the psychoacoustic ef- 
fect of such comparisons, indicating 
that frequency has a direct relationship 
to loudness judgments and that like in- 
tensities of different frequencies usually 
will not be judged equal in loudness. 
All experimental subjects previously 
had been tested in their own homes or 
schools by the examiner for air- and 
bone-conduction thresholds and for re- 
cruitment. This procedure gave the 
subjects preliminary practice on these 
tests and allowed the examiner to be- 
come familiar with responses of each 
prior to the time when final data were
collected. The subjects were given all 
the audiological tests prior to being ex- 
amined by the otologist and the pedia- 
trician in order to minimize any anxiety 
or emotional upset which might be re- 
lated to the latter examinations. The 
statistical mean for audiograms of the 
experimental subjects is shown in Fig- 
ure 1, accompanied by the audiograms 
of the subject with the most residual 
hearing and the subject with the least 
residual hearing. 

The Maico Aural-Overload Tester 
Model P-42 was used for the collection 
of aural-harmonic data. This appara- 
tus allowed testing of aural-harmonic 
thresholds at 1000 and 2000 cps up to 
a maximum intensity of 100 db. Since 
detection of harmonics in the ear is 
difficult, the probe-tone technique was 
used. In this procedure a second tone 
is delivered to the ear with the test 
tone. The probe tone is almost equal 
to the second harmonic of the funda- 
mental tone being tested (a difference 
of about 4 cps). When the subject’s 
ear begins to produce nonlinear dis- 
tortion of the fundamental (test tone), 
the second harmonic produced by the 
nonlinearity begins to beat with the
probe tone. It is the first detection of 
this beat above the audibility threshold 
for the particular tone which allows 
the examiner to determine the intensity 
at which the ear distorts, that is, the 
aural-harmonic threshold. 
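
The arithmetic behind the probe-tone technique can be sketched in a few lines. This is only an illustration and not a description of the Maico instrument: the function name `probe_tone_setup` is hypothetical, and placing the probe above (rather than below) the second harmonic is an assumption, since the text gives only the approximate size of the difference.

```python
def probe_tone_setup(fundamental_cps, offset_cps=4.0):
    """Return (second harmonic, probe tone, beat rate), all in cps.

    Assumption: the probe sits offset_cps ABOVE the second harmonic;
    the source states only that the difference is about 4 cps.
    """
    second_harmonic = 2 * fundamental_cps
    probe = second_harmonic + offset_cps
    # Once the ear itself generates the second harmonic, that harmonic
    # beats with the probe at a rate equal to their frequency difference.
    beat_rate = abs(probe - second_harmonic)
    return second_harmonic, probe, beat_rate


for test_tone in (1000, 2000):
    print(test_tone, probe_tone_setup(test_tone))
```

Below the aural-harmonic threshold no second harmonic is present in the ear, so the listener should report the two tones as ‘steady’; the beat appears only once the test tone is intense enough to be distorted.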


When administering the aural-har- 
monic test, the examiner instructed all 
subjects that they would hear a low- 
pitched tone which would be joined 
immediately by a second tone of higher 
pitch, and that they were to indicate 
whether the two tones produced a 
‘steady’ or a ‘wavy’ sound. These two 


descriptive terms were demonstrated 
by a vertical stroke of the tester’s 
hand accompanying the word ‘steady’ 
and by an undulating movement of 
the hand accompanying the word 
‘wavy.’ To facilitate the responses of 
the subjects and further to clarify the 
instructions, an 18 x 24-inch reinforced 
poster board containing visual symbols 
was stabilized on the lap of the subject. 
The three widely-spaced symbols were 
representative of the terms ‘steady’ and 
‘wavy.’ A heavy vertical line was used 
for ‘steady’ and a rapidly-undulating vertical
line for ‘wavy’ when 2000 cps was
being tested; a slowly-undulating ver- 
tical line was used for ‘wavy’ when 1000 
cps was being tested. Only one un- 
dulating line remained uncovered and 
visible during the testing. Each subject 
was asked to point to the line which 
represented what he heard. As a verification
procedure, the subjects periodically
were required to indicate
whether the ‘waviness’ changed as the 
examiner varied the frequency of the 
exploring tone slightly or left it un- 
changed. A similar technique has been 
used by others (17). 

Since all previous norms established 
for the aural-harmonic test (33, 34, 
37) were for adults, and it was not 
known whether responses of children 
would be different, the writer collected 
norm data for normal-hearing children 
between 11 and 19 years of age, using 
at least eight children at each age level. 
The number of ears tested was 147 at 
1000 cps and 143 at 2000 cps. The data 
from the normals were treated by age 
groups (11-13, 14-16 and 17-19 years) 
to determine whether differences in 
dynamic range existed among the 
groups. The variations of the mean 
scores for each of the three groups, 
and for the groups as a whole, were
less than two decibels. The mean dy- 
namic range at 1000 cps was 54.6 db 
with a standard deviation of 6.6 db, while
at 2000 cps the mean dynamic range 
was 67.7 db with a standard deviation 
of 8.5 db. These norms are comparable 
to those established by Lawrence and 
Blanchard (32) and Lawrence and 
Yantis (33). 

A caloric test, as described by Ko- 
brak (31), was included as part of the 
otologic examination. This is a test of 
vestibular function in which a con- 
tinuous flow of ice water is directed 
through a speculum into the external 
auditory meatus for 20 seconds. Venous 
blood from each subject and mother 
was examined at the University of 
Michigan Hospital Blood Bank (a) for 
Rh incompatibility between subject and 
mother and (b) for an Indirect-Coombs 
Test (7). The latter test was done on 
the blood to determine whether the 
mother still carried antibodies against 
the red blood cells of the subject. 

In 17 of the 20 subjects the Indirect- 
Coombs Test revealed positive anti- 
body titer in the mother’s blood. For 
two of the subjects antibodies were not 
demonstrable in the mother’s serum, 
although in both of these children 
there was definite Rh incompatibility 
with their respective mothers and the 
diagnosis was felt justifiable on this 
basis and on the classical clinical his- 
tory. For a third subject the mother’s 
blood type was O, R'h (or C) negative 
and that of the child A, R'h (or C) 
positive. It was clear that incompatibil- 
ity existed in both the ABO system and 
the R'h (or C) system between mother 
and child. Technical difficulty precluded
the use of an Indirect-Coombs
Test on this subject’s mother.


Results 


Recruitment tests and aural-harmonic 
thresholds were obtained on all 20 ex- 
perimental subjects. However, on two 
subjects only one of the two frequen- 
cies was tested for dynamic range. One 
subject had no hearing at 2000 cps and 
another had a normal threshold of sen- 
sitivity at 1000 cps. 

All 20 subjects demonstrated the 
presence of recruitment. Complete re- 
cruitment occurred in 15 cases. Of the 
five remaining subjects, all but one 
demonstrated partial recruitment. Each 
of the five subjects ‘caught-up’ in 
loudness by 20 to 40 db at the par- 
ticular frequency being balanced and 
then leveled off 6 to 10 db more in- 
tense than the ‘control’ frequency. In 
four cases the lack of complete re- 
cruitment can be accounted for by 
the presence of a slight conductive 
hearing loss. The conductive loss, su- 
perimposed on the primary perceptive 
loss, was shown by a 15 to 25 db ‘bone-air
gap’ on the audiogram and was verified
by otologic examination. The fifth
case showed no conductive loss. Data 
obtained indicated recruitment was oc- 
curring, but the test was not completed. 
This subject was unusually sensitive to 
high intensities and complained so 
vigorously that recruitment testing was 
not carried beyond 80 db. Several 
other subjects demonstrated intolerance 
for loud sounds. This sort of response 
also is considered to be a sign of 
cochlear involvement. Watson and 
Tolan (47) point out the correlation 
between recruitment and such excep- 
tional intolerance for loud sounds. In 
spite of some sensitivity to loud sounds, 
10 of the subjects wore hearing aids. 

Table 1. Aural-harmonic results showing means
and standard deviations in the normal and
experimental groups.

                                  1000 cps          2000 cps
                                Mean    S.D.      Mean    S.D.
                                (db)    (db)      (db)    (db)
Dynamic Range
    Normal Group                54.6     6.6      67.7     8.5
    Experimental Group          18.4     5.9      18.2     8.0
Aural-Harmonic Thresholds Above Audiometric Zero
    Normal Group                49.9     6.2      64.2     7.0
    Experimental Group          67.9     8.8      76.0     9.9


With recruitment demonstrated in
all 20 subjects tested, it can be expected
that a high percentage of any
comparable population would also dem- 
onstrate recruitment. By referring to 
a confidence belt for this proportion 
(12, p. 322) it is found that the pre- 
dictability for the occurrence of re- 
cruitment in a like population would be 
between 83% and 100%, with a con- 
fidence coefficient of .95. 
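
The 83% lower limit can be checked numerically. The sketch below assumes that the published confidence belt corresponds to the exact (Clopper-Pearson) binomial interval; the source cites only a chart, so this correspondence is an assumption, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, iterations=60):
    """Bisection on [0, 1] for the root of a function g that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided confidence limits for a binomial proportion."""
    half = alpha / 2
    if successes == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= successes | p) = alpha / 2
        lower = _root_of_increasing(
            lambda p: (1 - binom_cdf(successes - 1, n, p)) - half
        )
    if successes == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= successes | p) = alpha / 2
        upper = _root_of_increasing(lambda p: half - binom_cdf(successes, n, p))
    return lower, upper


lower, upper = clopper_pearson(20, 20)  # 20 of 20 subjects showed recruitment
print(f"{lower:.3f} to {upper:.3f}")
```

When every subject shows the trait, the lower limit reduces to (alpha/2) raised to the power 1/n, which for n = 20 and alpha = .05 is approximately .83, in agreement with the range quoted in the text.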

The range between the thresholds of 
sensitivity and the aural-harmonic
thresholds (dynamic range) at 1000 
and/or 2000 cps was significantly re- 
duced in each of the 20 experimental 
subjects when compared to normals 
(noted above). Table 1 shows the re- 
lationship between the results for the 
normal group and for the experimental 
subjects. At 1000 cps the mean dy- 
namic range for normals was 54.6 db 
with a range from 35 to 70 db and a 
standard deviation of 6.6 db. For the 
19 experimental subjects tested at 1000 
cps the mean dynamic range was 18.4 
db with a range of 10 to 30 db and 
a standard deviation of 5.9 db. A considerable
reduction in dynamic ranges
also was found at 2000 cps for the 19 
experimental subjects tested. This group 
had a mean dynamic range of 18.2 db 


with a range of 5 to 40 db and a stand- 
ard deviation of 8.0 db. The mean dy- 
namic range for the normal group at 
2000 cps was 67.7 db with a range of 
45 to 90 db and a standard deviation 
of 8.5 db. The mean dynamic range 
of the experimental subjects falls more 
than five standard deviations below the 
mean dynamic range of the normal 
group at 1000 and 2000 cps. 
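
The statement about standard deviations follows directly from the values in Table 1. As a brief check, using only the reported means and the standard deviations of the normal group (the variable and function names are illustrative):

```python
# Dynamic range (db) from Table 1: (mean, standard deviation) of the normal group
NORMAL = {1000: (54.6, 6.6), 2000: (67.7, 8.5)}
# Mean dynamic range (db) of the experimental group
EXPERIMENTAL_MEAN = {1000: 18.4, 2000: 18.2}


def sd_below_normal(freq_cps):
    """Distance of the experimental mean below the normal mean, in normal-group S.D. units."""
    normal_mean, normal_sd = NORMAL[freq_cps]
    return (normal_mean - EXPERIMENTAL_MEAN[freq_cps]) / normal_sd


for freq in (1000, 2000):
    print(freq, round(sd_below_normal(freq), 2))
```

The differences come to roughly 5.5 standard deviations at 1000 cps and 5.8 at 2000 cps, both beyond the five quoted in the text.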

Table 1 also shows the actual in- 
tensity levels for aural-harmonic thresh- 
olds in the normals and in the experi- 
mental subjects. With audiometric zero 
rather than the subject’s threshold as 
a base line, the mean aural-harmonic 
threshold for normal ears was 49.9 db 
at 1000 cps and 64.2 db at 2000 cps as 
compared with 67.9 db at 1000 cps and 
76 db at 2000 cps in the experimental 
group. These results compare very 
closely with those found by Lawrence 
and Yantis (33).

The caloric test was used to study 
the status of the non-auditory labyrinth 
and to compare its performance with 
that of the auditory labyrinth. Fourteen 
of the subjects, or 70%, gave normal
vestibular responses; 15% showed a 
normal response in one ear with a hy- 
poactive response in the other ear; an- 
other 10% responded hypoactively in 
both ears; and one subject, or 5%, had 
no vestibular response. If nystagmus 
was observed in one or two degrees 
(eyes looking away from the stimulated 
ear or looking forward) for one minute 
or less and not in the third degree 
(eyes looking toward the stimulated 
ear), the response was considered hy- 
poactive. The response was considered 
normal when nystagmus was observ- 
able in all three degrees for longer than 
one minute. There was no relationship 
between the severity of the hearing 
loss and the caloric response; for example,
the two subjects with the most
severe hearing losses gave normal ca- 
loric responses. 

The subjects were grossly ranked 
for severity of athetosis by the pedia- 
trician on a four-point scale from ‘mini- 
mal’ to ‘marked.’ The severity of hear- 
ing loss for each subject was then com- 
pared with severity of athetosis by 
converting a numerical ranking for 
hearing loss to the same four-point 
scale used by the pediatrician. The 
numerical ranking of severity of hear- 
ing loss was achieved by summing the 
thresholds of each subject at every 
frequency tested and then listing these 
totals from lowest to highest. Eight 
subjects had hearing losses which were 
the same in severity or within one- 
half rank of the severity of athetosis. 
Nine subjects had hearing losses which 
were ranked at the opposite end of 
the scale of severity from the associated 
athetosis. The remaining three subjects 
showed a severity rank for hearing 
loss which was neither close to nor far 
from the severity rank given for their 
athetosis. 
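
The ranking procedure described above might be sketched as follows. The threshold values are invented purely for illustration and are not data from this study, and the mapping from rank order to the four-point scale by equal-sized groups is an assumption, since the text does not state how the conversion was made.

```python
def severity_scale(threshold_sums, points=4):
    """Rank subjects by summed thresholds and map ranks onto a 1..points scale.

    Assumption: ranks are divided into equal-sized groups; the source
    does not specify the conversion actually used.
    """
    n = len(threshold_sums)
    order = sorted(range(n), key=lambda i: threshold_sums[i])  # lowest sum first
    scale = [0] * n
    for rank, subject_index in enumerate(order):
        scale[subject_index] = rank * points // n + 1
    return scale


# Hypothetical thresholds (db) at 250, 500, 1000, 2000, 4000 and 8000 cps
hypothetical_audiograms = [
    [40, 45, 55, 60, 70, 75],
    [60, 65, 70, 80, 85, 90],
    [30, 35, 50, 55, 60, 65],
    [70, 75, 85, 90, 95, 100],
]
sums = [sum(thresholds) for thresholds in hypothetical_audiograms]
print(sums, severity_scale(sums))
```

A subject's hearing-loss rating obtained this way can then be set beside the pediatrician's athetosis rating on the same scale.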


Discussion 


The results of the recruitment and 
aural-harmonic threshold tests give
strong evidence that the cochlea was 
damaged in the experimental subjects 
tested. With normal caloric responses 
in 77.5% of the ears tested, it would 
seem that the vestibular mechanism 
was, for the most part, functionally 
unimpaired in these subjects. Left to 
explain, at this point, is why the 
cochlea is involved. The results ob- 
tained in this study are contrary to 
most of what has been written about 
the hearing of the ‘Rh deaf’ child. 


There is also the problem of speculat- 
ing as to why the vestibular mech- 
anism, which is intimately connected 
with the cochlea, was not noticeably 
damaged. 
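
The figure of 77.5% of ears can be recovered from the per-subject caloric results reported in the Results section. The short tally below uses only those counts; the function name is illustrative, and it assumes that the one subject with no vestibular response lacked it in both ears.

```python
def normal_ear_fraction(both_normal, one_normal, both_hypoactive, no_response):
    """Fraction of ears with a normal caloric response, given subject counts."""
    subjects = both_normal + one_normal + both_hypoactive + no_response
    total_ears = 2 * subjects
    # Two normal ears per 'both normal' subject, one per 'one normal' subject
    normal_ears = 2 * both_normal + one_normal
    return normal_ears / total_ears


# 70%, 15%, 10% and 5% of 20 subjects
fraction = normal_ear_fraction(both_normal=14, one_normal=3,
                               both_hypoactive=2, no_response=1)
print(f"{fraction:.1%}")
```

Thirty-one of the 40 ears gave a normal response, which is the proportion cited in the Discussion.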


Most of the experimental subjects 
used in this study were multihandi- 
capped as the sequela of erythroblas- 
tosis. The effect which these handicaps 
had upon the audiologic testing pro- 
cedure is significant. One or more of 
the following kinds of behavior was 
common during the audiologic test- 
ing for most of the subjects: (a) dis- 
tractibility; (b) response lag; (c) per- 
severation; and (d) apparent rapid fa- 
tigue at pure-tone threshold. It was 
necessary for the examiner to be alert 
for the possible occurrence of such be- 
havior and to take it into consideration. 
The apparent rapid fatigue at thresh- 
old, of course, could not be altered, but 
the examiner’s perplexity was reduced 
by the recognition of this behavior. 
Many subjects responded to a tone at 
their apparent thresholds the first time 
it was presented and then did not re- 
spond to the same tone when it was 
presented again for verification. In fact, 
they leveled off at a threshold 5, 10 or 
even 15 db above the earlier threshold. 
This behavior may be a factor which 
has caused some authors (40) to refer 
to Rh athetoids as having a ‘fluctuating’ 
hearing loss. Even though the tone was 
presented to these subjects at thresh- 
old level, it must be remembered that 
many of these thresholds were con- 
siderably depressed and thus required 
an intensity which might quickly have 
led to fatigue. A connection between 
fatigue and loudness recruitment seems 
to be established both clinically and 
experimentally (24, 27). The behavior 
factors mentioned did not prevent the 
obtaining of consistent audiologic results,
but they offer some reasons for
conflicting opinions in the literature 
(25, 22, 37, 40) regarding the hearing 
status of Rh athetoids. 

One may be able to draw conclu- 
sions about a pathologic condition from 
the results of audiologic tests. How- 
ever, to attempt to explain on the basis 
of an audiologic test the process by 
which a pathologic condition developed 
is pure speculation. Nevertheless, some 
possible answers to these questions may 
be suggested. 

The question of why the cochlea is 
damaged in erythroblastosis and the 
vestibular apparatus probably is not 
can be answered hypothetically by re- 
ferring to analogous phenomena. In 
studies concerning the ototoxic effects 
of certain drugs, for example, strepto- 
mycin and dihydrostreptomycin (2, 
26), it has been found that cochlear involvement
occurs more often than vestibular
damage with one (dihydrostreptomycin),
while with another (streptomycin)
the opposite effect is found.
The same puzzling findings occur in 
Meniere’s syndrome (10). The patient 
may have cochlear deafness, without 
vertigo, or he may have labyrinthine 
vertigo, without deafness. Studies of 
this disease point to a selective or dif- 
ferential sensitivity of the structures or 
of the physiologic processes involved. 

A review of the common causes for 
a cochlear hearing loss indicates that 
acoustic trauma (noise), toxic effects 
of disease and anoxia are those occur- 
ring most frequently. Although the 
possibility of a toxic effect from bili- 
rubin (35) cannot be overlooked, the 
subjects used in this study might have 
suffered from oxygen deprivation. 
Flottorp, Morley and Skatvedt (17) 


point out the possibility that, in eryth- 
roblastosis, intrauterine anemia and 
possible edematous enlargements of the 
placenta may cause fetal anoxia. The 
bilirubin, released by hemolysis, has 
been implicated as being etiologically 
significant in producing cerebral anoxia 
in in-vitro animal studies (9). If anoxia 
were a common phenomenon in these 
subjects, it would be reasonable to as- 
sume that cochlear injury occurred as 
a result. One does not have to search 
far in the literature to document the 
effects of oxygen deprivation upon the 
cochlea. Animal studies by Wever, 
Bray and Lawrence (42), Wever et al.
(43) and Bekesy (1) have established 
that oxygen deprivation can injure the 
cochlea permanently. Gisselsson (20) 
showed that a two-minute deprivation
of oxygen supply to the cochlea of 
guinea pigs produced irreparable dam- 
age to that organ. Gulick (23) found 
destruction of hair cells in cats follow- 
ing oxygen deprivation. A recent study, 
which is of special interest because it 
refers to both cochlear and vestibular 
changes as a result of oxygen depriva- 
tion, is that by Kimura and Perlman 
(29, 30). This was an histological and 
pathological investigation in which the 
labyrinthine arteries of guinea pigs 
were surgically obstructed for various 
intervals of time. They reported pro- 
found changes in all structures of the 
cochlea. Changes in the hair cells with- 
in half an hour were followed in a 
few hours by involvement of the sup- 
porting cells. Vestibular injury was 
studied by caloric tests and verified by 
histologic observations. All degrees of 
response to the caloric test were noted, 
from total absence to normal. Changes 
in the vestibule were not as great as 
those in the cochlea. In about 20% 
of the animals no vestibular lesions
were found. The remainder showed 
various degrees of damage. A plausible 
explanation of such selective injury can 
be found in a statement by Gerard 
(18) that ‘. . . as a consequence of in- 
adequate oxygen supply, a differential 
injury can be expected in structures 
with different oxygen requirements.’ 


Summary 


Twenty hearing-defective erythro- 
blastotics (Rh athetoids) were given 
tests of cochlear function, namely, re- 
cruitment and aural-harmonic tests. 
Caloric tests of vestibular function also 
were given. Results demonstrated the 
presence of recruitment and a signifi- 
cantly reduced linear range of hearing 
in all the subjects. Responses to the 
caloric tests were normal in approxi- 
mately three-fourths of the ears tested. 


The results of this study warrant the 
following tentative conclusions: (a) 
the site of the injury in perceptive 
deafness associated with erythroblas- 
tosis is in the cochlea; (b) vestibular 
function, as evaluated by caloric test, is 
essentially normal; (c) severity of hear- 
ing loss in erythroblastosis cannot be 
predicted from the severity of the as- 
sociated athetosis. 
Respiratory Muscles In Speech


M. H. DRAPER 
PETER LADEFOGED 


D. WHITTERIDGE 


This study is concerned with the re- 
lationships between some acoustic fea- 
tures of speech and the following phys- 
iological variables: (a) the activity of 
the respiratory muscles, (b) the con- 
comitant variations in the pressure of 
the air in the lungs and (c) the varia- 
tions in the volume of air in the lungs. 
A first account of the experiments (1) 
and a discussion of some of the pho- 
netic implications (4) have been pub- 
lished elsewhere. 


Procedure 


The acoustic features were recorded 
with the aid of two microphones, each 
approximately 10 inches in front of 
the subject’s mouth. One microphone 
was connected to both a high quality 
tape recorder and a meter for meas- 
uring signal strength. The amplified 
output of the second microphone was 
displayed on a cathode-ray oscilloscope 
so that the wave forms of the utter- 
ances could be photographed simul- 
taneously with the other kinds of data 
to be described below. 

Pressure Recording. The pressure of 
the air in the lungs or in the trachea 
immediately below the vocal cords can- 
not easily be measured directly. Ac- 
cordingly in these experiments record- 
ings were made of pressure variations 
in the oesophagus. Subjects swallowed 
a balloon which was connected by a 
plastic tube with a bore of 2 mm to a 
rubber tambour whose magnified ex- 
cursions were recorded in ink on a 
kymograph by means of a lever sys- 
tem. The balloon and about 34 cm of 
the tube were passed through the nose 
so that the balloon rested in the oesophagus 
slightly above the level of the 
bifurcation of the trachea. When the 
balloon was inflated it was 25 mm long 
and had a diameter of 15 mm. Thus it 
pressed against the flexible membrane 
which forms the posterior wall of the 
trachea. Any increase in the pressure 
of the air in the trachea which dis- 
tended this membrane caused a cor- 
responding increase in the pressure in 
the balloon. 

In order to check on the correlation 
between the pressure in the balloon and 
the tracheal pressure, two supplemen- 
tary experiments were carried out. In 
the first, each of three subjects swal- 
lowed an oesophageal balloon and the 
pressure was recorded with the subject 
steadily exhaling against a resistance. 

Figure 1. Part of the display on two twin-beam CRO’s during the repetition of the syllable 
[ma]: (1) time marker, tenths of a second; (2) decreasing activity of the external intercostals 
(retouched); (3) microphone; (4) volume of air in the lungs; (5) increasing activity of the 
internal intercostals (retouched). 


The pressure in the mouth during this 
action also was recorded. This pressure 
should be the same as that in the 
trachea, provided that the rate of flow 
of air is slow and there are no con- 
strictions within the vocal tract or at 
the glottis. Judging by the sound of 
the air stream, there were no constric- 
tions during this subsidiary experiment. 

It was found that the pressure re- 
corded in the oesophagus was usually 
equivalent to the pressure of exhala- 
tion. But towards the end of a long 
expiration, when there was a small vol- 
ume of air in the lungs, there was often 
a tendency for the oesophageal pressure 
to increase. This tendency occurred 
also in the speech experiments to be de- 
scribed below. There were additional 
occasional discrepancies between the 
two records due to easily identified 
waves of muscular contraction pass- 
ing down the oesophagus. When these 
two sources of error had been taken 
into account, it was found that in each 
of the 48 observations which were 
made, the oesophageal pressure record 
provided a valid indication of the pres- 
sure of exhalation (difference between 
the two not significant at the 5% 
level). 

In the second supplementary experiment a hollow needle with an internal 
diameter of 1.5 mm was inserted in the trachea about 6 cm below the vocal 
cords. This experiment¹ was performed on one subject only. Measurements 
were made of the pressures recorded during 11 utterances of the form used 
in the experiments to be described below. As in the previous subsidiary 
experiment, it was found that the oesophageal pressure is a good measure 
of the tracheal pressure (difference not significant at the 5% level). Van 
den Berg (10) also has claimed that the oesophageal pressure can be 
regarded as a valid measure of the tracheal pressure. 
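
The comparison of paired pressure readings described above can be illustrated with a short calculation. The following is only a sketch: the article does not state which significance test was used, so a paired t test is assumed here, and the eleven pairs of readings are invented purely for illustration. The names `paired_t` and `T_CRITICAL_DF10` are likewise hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired readings."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1


# Invented readings in cm of water (NOT data from the study).
oesophageal = [3.1, 3.1, 3.0, 3.0, 3.0, 3.4, 2.8, 3.2, 2.9, 3.2, 3.0]
tracheal = [3.0, 3.2, 2.9, 3.1, 3.0, 3.3, 2.8, 3.1, 3.0, 3.2, 2.9]

# Two-tailed 5% critical value of t for 10 degrees of freedom.
T_CRITICAL_DF10 = 2.228

t_value, dof = paired_t(oesophageal, tracheal)
significant = abs(t_value) > T_CRITICAL_DF10
print(f"t = {t_value:.2f}, df = {dof}, significant at 5%: {significant}")
```

With readings of this kind the difference between the two records is small compared with its scatter, so the test gives no grounds for rejecting the oesophageal record as a measure of the tracheal pressure.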


Electromyography. The activity of the respiratory muscles used during 
speech was extensively studied by means of electromyography. An 
elementary explanation of the use of this technique in experiments on 
speech has been given elsewhere (4). In most of the experiments to be 
described, concentric-needle electrodes were inserted into the muscles 
which were being investigated. The electrodes were insulated externally 
by bakelite varnish to the tip. The central insulated wire had a diameter 
of about 0.8 mm. In some experiments, surface electrodes 
consisting of silver plates about 5 mm 


¹Mr. I. Harris, F.R.C.S., acted as surgeon 
in this experiment. 
in diameter were also used. In all experiments 
the recorded potential differences were amplified and displayed 
on two double-beam cathode-ray oscil- 
loscopes which could be photographed. 
An example of part of the recorded 
data is reproduced in Figure 1, which 
will be discussed later. Electromyogra- 
phy was used for the investigation of 
four muscles or groups of muscles: the 
external intercostals; the internal inter- 
costals; the latissimus dorsi, rectus 
abdominis, internal and external obliques; and the diaphragm. 

The external intercostals are thin 
sheets of muscle running between each 
rib. Their fibres run upwards and back- 
wards so that they may be thought of 
as the lower portion of a sheet of 
muscles linking the ribs to the fixated 
first rib, the vertebrae of the neck and 
the base of the skull. Their action is to 
lift the rib cage outwards and thus ex- 
pand the lungs. In investigating the 
action of these muscles, concentric- 
needle electrodes were usually inserted 
in the fifth intercostal space three 
inches from the midline, posteriorly. 
This area was chosen because it is the 
only place where the external inter- 
costal muscles can be found immedi- 
ately under the skin. When the arms 
are folded on the chest the shoulder 
blades move outwards and forwards 
and uncover the external intercostals. 
In this region, near the midline, internal 
intercostal muscles are absent. Surface 
electrodes placed on the skin in this 
area also may be used to record the 
activity of these muscles. 

Internal intercostals also lie between 
the ribs but are deeper than the ex- 
ternal intercostals. Their fibres run 
downwards and backwards and therefore 
lie at an angle to those of the 
external intercostals. They may be 
thought of as a sheet of muscles link- 
ing the ribs to the pelvis through other 
muscles, for example, the abdominal in- 
ternal obliques. Their function is to 
pull the ribs down, reducing the size of 
the thoracic cavity and thereby increas- 
ing the pressure of the air in the lungs. 
In investigating the action of these 
muscles, concentric-needle electrodes 
were usually inserted near the mid- 
axillary line in the sixth or seventh 
interspace. The activity of these 
muscles cannot be investigated satis- 
factorily with surface electrodes. Rec- 
ords obtained in this way are very dif- 
ficult to interpret because of the prox- 
imity of the external intercostals as 
well as overlying thoracic and abdomi- 
nal muscles. 

Latissimus dorsi, rectus abdominis, 
internal and external obliques are 
muscles of the back and abdomen 
which are used mainly for muscular ac- 
tivity involving the arms and bending 
of the trunk. They can, however, be 
used to assist in diminishing the tho- 
racic cavity when extreme efforts are 
made to force air out of the lungs. 
Their activity was investigated with 
surface electrodes and concentric-needle 
electrodes. 

The diaphragm, a dome-shaped 
muscle, forms the base of the thoracic 
cavity. When it contracts, the thoracic 
cavity is enlarged inferiorly, thereby 
expanding the lungs. Because of its sit- 
uation in the body there is difficulty 
in inserting needle electrodes into it 
and its activity cannot be studied with 
electrodes on the skin. Accordingly a 
different technique² was developed. 


²Dr. P. B. C. Matthews of the University 
Laboratory of Physiology, Oxford, suggested 
the approach employed. 
Subjects swallowed a thin tube containing 
three leads connected to three electrodes 
near the tip. Since the oesophagus passes 
through the diaphragm before reaching 
the stomach, this tube 
could be adjusted so that the elec- 
trodes were placed in effect directly 
on the diaphragm. This proved a very 
useful way of recording the activity 
of this muscle. 

On some occasions simultaneous re- 
cordings were made from pairs of 
muscles, for example, the internal and 
external intercostals, as illustrated in 
Figure 1. But it was found convenient 
more often to record the activity of 
only one muscle at a time. 


Volume Recording. The remaining 
variable studied in the current series 
of experiments was the variation in the 
volume of air in the lungs. Subjects 
were sealed in a body plethysmograph 
(a rigid airtight container which in 
this case was two steel barrels welded 
together) in such a way that only the 
head and neck were exposed. When the 
subject inflated his lungs, air was dis- 
placed out of the barrel; when he 
breathed out, air flowed back in again. 
The displaced air moved a spirometer 
whose excursions were recorded in ink 
on a kymograph. This procedure gave 
an accurate quantitative record of the 
movement of air in and out of the lungs 
in respiration and speech, without the 
interference with the subject’s vocal 
processes which must occur if a mask 
is placed on his face. 

The movements of the spirometer 
also were electrically recorded on one 
of the cathode-ray oscilloscopes on 
which the electromyographic data were 
displayed (see Figure 1). The moment 
of maximum inspiration immediately 
before speaking (which is not shown 
on Figure 1) was used as a zero point 
from which to measure intervals on 
both the kymograph and the cathode- 
ray oscilloscope recordings. Synchro- 
nization of the two kinds of recordings 
could also be checked by reference to 
a mark automatically made on the ky- 
mograph whenever the camera motor 
was switched on. On some occasions 
it was also possible to see some indica- 
tion of the separate words either on 
the volume or on the pressure record 
on the kymograph. This enabled cor- 
relations to be made with the amplified 
output of the microphone which was 
displayed on one of the cathode-ray 
oscilloscopes. 


Subjects. Data were obtained from a 
total of 18 subjects, all of whom were 
native speakers of English. The mus- 
cular activity involved in the speech 
of the five principal subjects, all of 
whom were male members of the staff 
of the University of Edinburgh, was 
recorded with the aid of concentric- 
needle and surface electrodes. Surface 
electrodes only were used with the 
other subjects, four women and seven 
men, all of whom were students or 
staff of the University of Edinburgh. 
Each of the principal subjects engaged 
in spontaneous conversation and also 
read several lists of words and a num- 
ber of short sentences. The results re- 
ported below are valid for all these 
subjects in so far as they relate to the 
correlations between the muscular ac- 
tivity, the wave form and such features 
of speech as can be assessed by the ear 
of a trained observer. For three sub- 
jects, simultaneous recordings were ob- 
tained of (a) the sound waves, regis- 
tered by a microphone; (b) the oesoph- 
ageal pressure, shown by the varia- 
tions of the pressure in a swallowed 
balloon; (c) the muscular activity, recorded 
by concentric-needle electrodes; and (d) the 
volume of air involved in speech and respiration, shown 
by the volume of air displaced from 
the body plethysmograph. In these cir- 
cumstances, a total of 211 valid obser- 
vations was made of muscular activity. 
It is appreciated that observations of 
many more subjects need to be made. 
However, the results obtained so far 
are sufficiently consistent to suggest 
the general pattern of the relationships 
involved. 


Results and Discussion 


During the pronunciation of a con- 
tinuous utterance consisting mainly of 
equally stressed syllables (such as 
counting or repeating a single stressed 
syllable), the mean level of the pres- 
sure of the air in the lungs was re- 
markably constant; but superimposed 
on this mean steady increase in pres- 
sure were numerous small fluctuations. 


The mean level of the pressure of 
the air in the lungs depended on the 
loudness with which the speaker was 
trying to talk. When he talked quietly, 
the mean increase was about 2 cm of 
water; when he talked loudly, the 
mean increase was about 5 cm of 
water; when he shouted, the mean in- 
crease was at least 9 cm of water. 
Sometimes, in the loudest shouting, the 
increase was more than 30 cm of 
water. 


In one series of experiments, subjects 
were instructed to count from one to 
20, making the words equally loud. 
The mean pressure remained fairly con- 
stant as long as the subject, in his own 
opinion and in the opinion of the listening 
investigators, succeeded in following 
these instructions. But in these circumstances 
the intensity (that is, the amount of acoustic 
energy) of the different words varied 
considerably. Thus on several occasions during 
counting from one to 20, the peak intensity in 
some of the words, ‘three’ for example, was 10 db 
to 15 db less than in some of the others, such 
as ‘nine.’ But subjects were not usually aware of 
these differences and sometimes even denied the 
possibility of their having occurred. 
Differences in intensity of this magnitude can 
normally be perceived as differences in loudness 
in anything other than speech sounds. It seems, 
therefore, that naive listeners, obeying an 
instruction to consider the loudness of sounds in 
continuous speech, do not assess the acoustic 
properties of the sounds but consider, instead, 
the pressure which would be required below the 
vocal cords. In other words, the term ‘loudness’ 
is probably generally used in one way in 
discussions about speech sounds and in quite a 
different way in discussions of other sounds. 

Figure 2. Upper part of figure, a reproduction 
of a record of the variations in the 
volume of air in the lung and the oesophageal 
pressure during respiration and speech (counting 
from one to 32 at a conversational 
loudness). Lower part of figure, a diagrammatic 
representation of the muscular activity 
which was observed to accompany such 
pressure and volume changes. The dashed 
line which has been superimposed on the 
pressure record indicates the relaxation pressure 
associated with the corresponding volume 
of air in the lungs. It is equal to zero when 
the amount of air in the lungs is the same 
as that at the end of a normal breath. The 
arrows indicate the moment when the relaxation 
pressure is no longer greater than the 
mean pressure below the vocal cords. At 
this moment the external intercostal activity 
ceases, and that of the internal intercostals 
commences. 
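
The size of the 10 db to 15 db differences reported for the counting experiment may be easier to appreciate as ratios of acoustic intensity. The sketch below uses only the standard definition of the decibel as ten times the common logarithm of an intensity ratio; the function name `intensity_ratio` is an illustrative choice and does not come from the study.

```python
def intensity_ratio(level_difference_db):
    """Intensity ratio corresponding to a level difference in db."""
    return 10 ** (level_difference_db / 10)


for difference_db in (10, 15):
    ratio = intensity_ratio(difference_db)
    print(f"{difference_db} db corresponds to an intensity ratio of about {ratio:.1f}")
```

On this reckoning a word 10 db weaker than another carries one tenth of the peak intensity, and a word 15 db weaker about one thirtieth, which makes the subjects' failure to notice the differences the more striking.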

A typical recording showing the re- 
lation between the volume of air in 
the lungs and the pressure recorded in 
the oesophagus is reproduced in the 
upper part of Figure 2. This shows, 
first, a normal breath of a little more 
than half a litre; second, a deeper in- 
spiration as the subject prepares to 
speak; and third, a decrease in volume 
during the utterance (counting from 
one to 32 at conversational loudness). 
During the utterance there is an in- 
crease of about 3 cm of water in the 
mean level of the oesophageal pressure; 
after the utterance the oesophageal 
pressure returns to the previous mean 
level and respiration continues. Small 
fluctuations, approximately one for each 
stressed syllable, can be seen on the 
oesophageal pressure record during the 
utterance. 

Before the patterns of muscular ac- 
tivity that accompany these pressure 
and volume variations can be discussed, 
it is useful to consider the factors 
which may affect the pressure of the 
air below the vocal cords. (a) The 
pressure will be decreased by an inspiratory 
muscular effort, such as lifting the rib cage 
by means of the external intercostals or contracting the 
diaphragm. These are both actions 
which enlarge the thoracic cavity. (b) 
The pressure will be increased during 
an expiratory muscular effort which 
could involve such muscles as the in- 
ternal intercostals, the external obliques 
or rectus abdominis, all of which can 
function so as to decrease the size of 
the thoracic cavity. (c) The pressure 
will be affected by the resistance to 
the air stream at the glottis or else- 
where in the vocal tract. But any 
variation in the amount of this resist- 
ance will affect not only the pressure 
below the vocal cords but also the rate 
of flow of air out of the lungs. In the 
utterance illustrated in Figure 2, and 
in all the utterances to be considered in 
this article, the mean rate of flow dur- 
ing a number of consecutive words 
shows little variation. Consequently the 
mean resistance cannot have varied dur- 
ing the course of the utterance. (The 
rate of flow of air during a single word 
is, of course, far from constant; and the 
concomitant variations in the amount 
of resistance may have an effect on the 
pressure of the air below the vocal 
cords. But this will not affect the mean 
or background pressure measured over 
a period of several seconds.) (d) The 
final factor affecting the tracheal pres- 
sure is the relaxation pressure (6), that 
is, the sum of the forces from the ab- 
domen and the forces exerted by 
stretched lung tissues and the elastic 
structures of the rib cage. The lungs 
consist of a number of air chambers 
contained within elastic membranes 
which may be likened to toy balloons: 
when they are inflated they have a 
tendency to collapse; and the larger 
the volume of air inside them, the larger 
the pressure of that air. After a maximal 
inspiration, when the rib cage has been fully 
raised and the lungs expanded so that the elastic 
membranes are considerably stretched, the 
relaxation pressure may be very large, more than 
30 cm of water; but after a normal inspiration, 
such as occurs in quiet breathing, the relaxation 
pressure will be only about 5 cm of water. In this 
connection it should be remembered that when the 
diaphragm contracts, the pressure in the lungs is 
decreased; but, because the abdomen is pushed 
down, the pressure below the diaphragm is 
increased. When the diaphragm relaxes, the 
abdominal pressure pushes the diaphragm up again. 

22 Journal of Speech and Hearing Research 

The four factors affecting the pres- 
sure of the air below the vocal cords 
may be considered by an analogy with 
a pair of bellows which has (a) a mech- 
anism to pull the handles apart, cor- 
responding to the inspiratory activity 
of the diaphragm and the external intercostals; 
(b) an opposing mechanism 
which will pull the handles together, 
corresponding to the expiratory activ- 
ity of the internal intercostals and oth- 
er muscles; (c) a variable orifice, cor- 
responding to variations in the con- 
strictions at the glottis and in the vocal 
tract; and (d) a spring between the 
handles, corresponding to the relaxation 
pressure, which will exert a consider- 
able force on the handles when they 
have been pulled wide apart but which 
will exert less and less force as the 
handles come together. 

The muscles regulating the air pressure during 
many utterances have to be operated in such a way 
as to maintain a constant mean background pressure 
in the lungs, despite a steady decrease in the 
relaxation pressure or, in 
terms of the analogy, the pull of the 
spring. This may be done in various 
ways. If, after a deep inspiration, the 
relaxation pressure is much greater than 
the pressure required below the vocal 
cords, then inspiratory muscles are 
used to decrease the pressure in the 
lungs. As the volume of air in the 
lungs decreases, and thus the relaxation 
pressure becomes less, the inspiratory 
muscles usually cease acting and the 
pressure necessary for speech is main- 
tained by bringing expiratory muscles 
into action. Towards the end of a long 
utterance, when the volume of air in 
the lungs is very small, a large number 
of expiratory muscles may be needed 
to keep the mean pressure constant. 
This pattern of activity was nearly 
always observed in the subject who 
produced the utterance shown in Figure 2. The 
lower part of this figure is a diagrammatic 
representation of the muscular activity which was 
typical of his manner of using the respiratory 
muscles to maintain a steady mean pressure. During 
the first part of an utterance, beginning after a 
deep inspiration and maintaining a mean pressure 
of about 3 cm of water, the external intercostals 
remain in action, regulating the pressure of the 
air below the vocal cords by checking the descent 
of the rib cage. As the volume of air in the lungs 
decreases, the action of the external intercostals 
diminishes and eventually ceases altogether when 
the volume of air in the lungs is slightly less 
than the volume after a normal inspiration. From 
this moment on, expiratory activity is needed in 
order to maintain the pressure below the vocal 
cords and accordingly the internal intercostals 
come into action with gradually increasing 
intensity. When the volume of air in the 
lungs is a little below that at the end 
of a normal expiration, the action of 
the internal intercostals is supplemented 
by various other muscles, such as the 
external obliques, rectus abdominis and 
latissimus dorsi. 

The dashed line in the pressure curve 
in Figure 2 indicates the relaxation pres- 
sure which would be produced by the 
forces acting on the air in the lungs 
in the absence of any muscular action. 
At the beginning of the utterance it is 
about 16 cm of water, and it comes 
down to zero when the volume of air 
in the lungs is the same as that at the 
end of an expiration in normal quiet 
breathing. It may be seen that the 
external intercostals provide a checking 
inspiratory action as long as the re- 
laxation pressure is higher than the re- 
quired tracheal pressure. 
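
The rule relating the relaxation pressure to the kind of muscular effort needed can be set out as a small calculation. This is a minimal sketch under simplifying assumptions that are not made in the article: the pressures are treated as simply additive, the resistance of the glottis and vocal tract is ignored, and the negative relaxation pressure in the last reading is a hypothetical value standing for a volume below that at the end of a normal expiration. The function names are illustrative only.

```python
def muscular_pressure_needed(required_cm_h2o, relaxation_cm_h2o):
    """Pressure the muscles must supply, in cm of water.

    Negative values call for an inspiratory (checking) effort,
    positive values for an expiratory effort.
    """
    return required_cm_h2o - relaxation_cm_h2o


def muscle_action(required_cm_h2o, relaxation_cm_h2o):
    """Name the kind of effort implied by the two pressures."""
    net = muscular_pressure_needed(required_cm_h2o, relaxation_cm_h2o)
    if net < 0:
        return "inspiratory checking (e.g. external intercostals)"
    if net > 0:
        return "expiratory effort (e.g. internal intercostals)"
    return "no net muscular effort"


REQUIRED = 3  # mean pressure of about 3 cm of water, as in Figure 2

# Relaxation pressure falling from 16 cm of water as the lungs empty;
# the final negative value is an assumed illustration.
for relaxation in (16, 10, 3, 0, -4):
    print(relaxation, "->", muscle_action(REQUIRED, relaxation))
```

The change-over in the sketch falls where the relaxation pressure equals the required pressure, which corresponds to the moment marked by the arrows in Figure 2.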

The change-over from the use of 
one set of muscles, the external inter- 
costals, to the use of another set, the 
internal intercostals, may be seen in 
Figure 1. In this case the whole utter- 
ance consisted of about 20 repetitions 
of the single stressed syllable [ma]; but 
only that part of the recording has 
been reproduced which shows dimin- 
ishing activity of the external inter- 
costals (top trace) followed by increas- 
ing activity of the internal intercostals 
(bottom trace). On both these traces 
the large waves, recurring regularly at a 
rate of almost two per second, are 
due to the electrical activity of the 
heart and are, of course, irrelevant to 
the present observations. The action 
potentials are the somewhat smaller 
vertical spikes. The amplification of the 
activity recorded from the external in- 
tercostals was, on this occasion, slightly 
greater than that recorded from the internal 
intercostals. 

The external intercostals are the 
muscles principally used to check the 
descent of the rib cage. The diaphragm, 
since it is an inspiratory muscle, might 
be expected, on the basis of the mainte- 
nance of constant mean pressure, also 
to operate in speech as a checking 
muscle when the relaxation pressure is 
more than is needed for a particular 
utterance. But this does not usually 
happen. The diaphragmatic activity of 
11 subjects has been recorded. In the 
case of nine of these subjects, the in- 
spiratory activity of the diaphragm di- 
minished rapidly, ceasing completely 
during the first two or three seconds 
of an utterance after a maximal in- 
spiration. The action of the external in- 
tercostals was not recorded at the same 
time but other observations indicate 
that in such utterances the external in- 
tercostals are in operation for consider- 
ably longer. Thus it appears that the 
diaphragm did not play a significant 
part in the speech of these nine sub- 
jects. The other two subjects (one of 
whom was one of the three subjects 
whose muscular activity was recorded 
while in the body plethysmograph) 
maintained their diaphragms in action 
not only during the first part of utter- 
ances, when the relaxation pressure 
was high, but also when there was a 
smaller volume of air in the lungs. 
They used increased activity of ex- 
piratory muscles to offset the appar- 
ently unnecessary or excessive dia- 
phragmatic action. 


In the case of speakers who do not 
use the diaphragm during speech, the 
muscular activity depends mainly on 
(a) the mean pressure below the vocal 
cords and (b) the amount of air in 
the lungs. The relation between the muscular 
activity and these variables can be shown 
graphically. Figure 3 presents a number of graphs 
to show the results of a series of observations of 
the subject whose speech is illustrated in Figure 
2. The horizontal axis on each graph represents 
the volume of air in the lungs; the vertical axis 
shows the oesophageal pressure. Readings, from 
left to right, represent the decrease in the 
amount of air in the subject’s lungs which occurs 
while he is talking in such a way that the mean 
pressure remains relatively constant. The first 
graph shows two groups of lines, the one 
indicating the pressures and volumes during which 
the external intercostals were observed to be in 
action, the other indicating the pressures and 
volumes during which the latissimus dorsi was 
observed to be in action. The second, third and 
fourth graphs show the pressures and volumes 
during which the internal intercostals, the 
external obliques and rectus abdominis, 
respectively, were observed to be in action. 

Figure 3. Observed activity of a number of 
groups of muscles at various pressures and 
volumes. Each horizontal line indicates a 
range of volumes over which muscular activity 
continued while maintaining the corresponding 
pressure. The dashed lines indicate 
the limits of muscular activity which might 
be found if it were possible to make a 
systematic investigation of each muscle by 
making a very large number of observations. 

It is clear from these graphs that 
speech in which the mean pressure 
below the vocal cords is comparatively 
high involves earlier and greater ac- 
tivity of the expiratory muscles than 
does speech produced with a lower 
subglottal pressure. In speech such as 
loud talking or shouting, the checking 
action of the external intercostals oc- 
curs for short periods only after a 
very deep inspiration. The internal intercostals 
come into action before the volume of air in the 
lungs has decreased to that associated with quiet 
breathing, and the expiratory actions of the other 
muscles also begin earlier. Quiet speech, on the 
other hand, involves more prolonged use of the 
external intercostals, and, unless such speech is 
continued until there is a comparatively small 
amount of air in the lungs, muscles such as the 
rectus abdominis and the latissimus dorsi will 
not be involved at all. 

Figure 4. Data of Figure 3 reduced to a 
schematic form which enables predictions to 
be made concerning which muscles will be 
active when talking at various pressures or 
‘loudness’ levels and with different volumes 
of air in the lungs. Since in most utterances 
the mean pressure remains fairly constant 
while the volume of air in the lungs diminishes, 
the pattern of muscular activity when 
talking at any given loudness can be determined 
by following an appropriate horizontal 
line from left to right. The pressures and 
volumes found in most normal conversational 
English are enclosed in a rectangle labelled 
‘conversation.’ 


When the data shown in Figure 3 
are reduced to a schematic form as in 
Figure 4, it is possible to see approxi- 
mately which respiratory muscles will 
be involved in all kinds of speech ac- 
tivity in which the diaphragm is not in 
action. Most of the range of pressures 
and volumes which are shown in Figure 
4 occur only in certain kinds of speech 
activity which are comparatively rare. 
The pressures and volumes which are 
typical of normal conversation are en- 
closed within heavy lines; within these 
limits the internal intercostals play the 
major part. 

At the moment, the general validity 
of Figure 4 may be assessed only in 
terms of the data which were obtained 
from two of the three subjects whose 
muscular behaviour was investigated at 
the same time the pressure and volume 
of air in the lungs were recorded. (The 
third subject who was studied in de- 
tail used both expiratory muscles and 
an inspiratory action of the diaphragm 
simultaneously; consequently Figure 4 
does not apply to him.) The variations 
in the behaviour of the first subject 
may be seen from the graphs in Figure 
3. The second subject fits the idealised 
scheme of Figure 4 to much the same 
extent. Despite the few subjects who 
can be studied in the intensive way and 
by the somewhat uncomfortable procedure 
needed to obtain the data for the 
scheme shown in Figure 4, the nature 
and number of the results are such 
that the general plan of respiratory 
muscular co-ordination during speech 
seems clear enough. 

The quantitative results reported 
above are also in accord with further 
electromyographic records (without si- 
multaneous pressure and volume rec- 
ords) which have been obtained of the 
internal intercostals in three more subjects. 
These records show that these three subjects 
likewise always used the internal intercostals 
except when talking very quietly after a deep 
inspiration, and that during all other kinds of 
speech the amount of internal intercostal activity 
recorded increased as the volume of air in the 
lungs became less. 

The activity of rectus abdominis was 
studied also in five more subjects with- 
out simultaneous recording of the pres- 
sure and volume of air in the lungs be- 
ing made. It was observed that during 
normal conversational speech all these 
subjects used this muscle only towards 
the end of a long utterance. 


It was often noticeable that the ac- 
tivity of the internal intercostals, and 
occasionally that of some of the other 
muscles, did not increase uniformly as 
the volume of air in the lungs became 
less. Instead, bursts of activity were fre- 
quently separated by moments of com- 
parative quiescence, somewhat in the 
way suggested by Stetson (9). In an 
article (4) discussing the phonetic and 
phonological implications of these vari- 
ations in muscular tension, it has been 
shown that the bursts of muscular ac- 
tivity may be correlated with the per- 
ceived stress of the utterance. Similar 
observations have been made independently 
by Fonagy (2). 

There is an extensive literature on 
the action of the respiratory muscles in 
speech, the early part of which has 
been discussed by Wiskell (17). How- 
ever, many of the investigators, for ex- 
ample, Stetson (9), carried out their 
experiments apparently without realis- 
ing the importance of the relation be- 
tween the respiratory muscles and the 
relaxation pressure. An exception is 
Roos (8), whose knowledge of the re-
laxation pressure and the pressure-
volume relations was derived from the
excellent but now almost forgotten 
paper of Rohrer (7). The concepts es- 
tablished by Rohrer were applied in a 
valid manner by Roos to singing, talk- 
ing and playing wind instruments. But 
Roos made no special study of muscu- 
lar activity in speech and he concluded 
that in flute playing, at least, rapid 
fluctuations in pressure were due en- 
tirely to contractions of the muscles 
of the lips. He believed, without ex- 
perimental basis, that the muscles of 
the chest could not contract fast 
enough for this. 

In considering the muscles respon- 
sible for slowing up expiration, Roos 
suggested that the diaphragm was the 
principal muscle concerned. He based 
his view on Jagic and Lipiner’s (3) 
observations on diaphragm movement 
in wind instrument players. These ob- 
servations by x rays on the movements 
of the diaphragm shadow are difficult 
to interpret and carry little conviction. 

In the literature on speech and sing- 
ing there is a great deal written on 
the Atemstütze or breath hold, a sub-
ject well reviewed by Luchsinger (5)
who said (in German): 


All authors are agreed that we must 
regard the hold (Stütze) as the regulator
of the air supply, the singer feeling during 
expiration a ‘so-called inspiratory tension’ 
. . . According to R. Schilling two kinds 
of ‘Stütze’ are to be distinguished, depend-
ing on whether the forces accumulated
as a result of a maximal inspiration are 
initially released by the thoracic muscles 
or by the diaphragm. In the first case the 
chest wall falls while the diaphragm re- 
mains in its inspiratory position for an 
appreciable time (up to 8 sec). In the 
second case the thorax is held firmly in 
the inspiratory position while the dia- 
phragm rises. This occurs especially in 
legato and staccato singing. 


The pattern of respiratory muscular 
activity in trained singers was not in- 
vestigated in the current series of ex- 
periments. But observations of those 
who have received no training show 
that the relaxation pressure is usually 
opposed by the external intercostal 
muscles rather than by the diaphragm. 
No attempt was made to assess the rela-
tive efficiency of the various
ways of using the respiratory muscles 
to adjust the subglottal pressure during 
speech. The simultaneous use of both 
inspiratory and expiratory muscles may 
give a greater control of the pressure 
and hence of the utterance. But it is 
possibly significant that those subjects 
who were practiced lecturers did not 
maintain the diaphragm in a state of 
tension while using other muscles to 
decrease the volume of the thoracic 
cavity. All practiced lecturers quickly 
learnt to use their diaphragms in this 
way; but none of them felt comfort- 
able when talking while maintaining 
the diaphragm in action. 


Summary 


The action of some of the respira- 
tory muscles during speech was in- 
vestigated by means of electromyogra- 
phy. Simultaneous recordings were 
made also of the oesophageal pressure
(which is shown to be a valid indica-
tion of the mean subglottal pressure),
the volume of air in the lungs and the 
wave form of utterances. These in- 
dicated the change-over from the use 
of one muscle to another as volume 
and pressure changed. Quantitative ob- 
servations of this pattern of activity 
are given. Data are reduced to a sche- 
matic form from which predictions 
may be made concerning which 
muscles will be active when a subject 
is talking at various pressures or loud- 
ness levels and with different volumes 
of air in the lungs. 


Electrophysiologic Responses To Sound
As A Function Of Intensity,
EEG Pattern And Sex


The accuracy and objectivity of
thresholds for electrodermal and elec- 
troencephalic responses (EDR and 
EER) to sound are restricted by the 
limited knowledge of: (a) the increase 
of responsiveness with increases in in- 
tensity of stimulation and (b) the re- 
lation of responsiveness to the psy- 
chophysiologic characteristics of the 
individual under test. 

Several investigators (4, 7, 9, 10) 
have studied the relation between the 
amplitude of electrodermal responses 
and intensity of stimulus. At present,
however, there is no way of knowing 
from the amplitude of the EDR how 
close to threshold the auditory stimu- 
lus was. Furthermore, the measurement 
of the amplitude of electroencephalic 
responses to sound is impractical clin- 
ically. An EER may be either a re- 
duction or an increase in the amplitude 
of the existing electric activity, perhaps 
a new but transient activity, or else a
complete change in the basic activity.


In electrophysiologic tests of hearing 
and in the more conventional psycho- 
physical tests of hearing, what is re- 
ported is whether or not a person 
responded, not how large a response 
he gave. The amplitude of any activity 
may help a tester decide whether that 
particular activity (such as EDR, EER, 
movement of a finger) is a response to 
a stimulus; but it is from the proportion 
of responses as a function of intensity 
that auditory thresholds are estimated. 

Berry and Martin (1) found that
conditioning of electrodermal responses 
to a sound of fixed intensity varied 
with the pretest instructions, and that 
the effect of these instructions on men 
was different from the effect on 
women. Charan and Goldstein (2)
found that it was more difficult to elicit 
conditioned electrodermal responses to 
a sound of fixed intensity from men 
with a dominant alpha rhythm in their 
EEG than from men whose EEG con- 
tained little or no alpha rhythm. This 
finding did not obtain for women. 

The purposes of the present investi- 
gation were (a) to study the proportion 
of EDR and EER as a function of in-
tensity close to the threshold of hear-
ing, (b) to learn how the percentage
of responses varies with sex and with
the pattern of the EEG, and (c) to
derive a criterion for the estimation
of threshold from the percentage of
EDR and EER as a function of
intensity.


Procedure 


Subjects. The subjects for this study 
were 22 men and 14 women between 
the ages of 17 and 40 with normal 
hearing. The subjects knew nothing 
about the procedures before they were 
tested and they were instructed not to 
discuss the procedures with anyone 
after the test was finished. 

Apparatus. The apparatus used in 
this study was the same as that pre- 
viously described by Goldstein, Lud- 
wig and Naunton (6), with the modi- 
fications described by Charan and 
Goldstein (2). 

In brief, the electrodermogram 
(EDG) was recorded on one channel 

Figure 1. Sample electroencephalograms (EEG) and electrodermograms (EDG). In A there
is an electroencephalic response (EER) and an electrodermal response (EDR); in B only an
EER; in C only an EDR; and in D neither an EER nor EDR. The EDR in B and D, following
TONE OFF by approximately 2 to 2.5 sec, are to shocks which were given shortly before the
end of the tones. The abrupt change in the EDG in A, approximately 4 sec after the end of
the tone, resulted from readjustment of the Wheatstone bridge by the operator.

Table 1. Sample schedule of stimuli.

 S  Ear      SL (db)  Shock     S  Ear      SL (db)  Shock

 1  R        +10      +        31  L        +10      +
 2  L        -5                32  R        0
 3  R        0                 33  CONTROL  -
 4  CONTROL  -                 34  L        -5
 5  L        +5       +        35  R        -5
 6  CONTROL  -                 36  L        0
 7  R        -5                37  CONTROL  -
 8  L        0                 38  R        +10
 9  L        +10      +        39  R        +5       +
10  R        +5                40  L        +5       +
11  R        -5                41  CONTROL  -
12  L        +10      +        42  L        -5
13  L        0                 43  R        -5
14  R        +10      +        44  L        +5
15  L        -5                45  CONTROL  -
16  R        +5                46  L        +10      +
17  CONTROL  -                 47  R        0
18  L        +5       +        48  R        +10      +
19  R        0                 49  R        +5       +
20  CONTROL  -                 50  L        0
21  R        +5       +        51  L        +5       +
22  CONTROL  -                 52  L        -5
23  L        -5                53  R        -5
24  R        +10      +        54  R        +5       +
25  R        -5                55  CONTROL  -
26  L        +5       +        56  CONTROL  -
27  R        0                 57  L        +10      +
28  CONTROL  -                 58  R        +10
29  L        0                 59  R        0
30  L        +10               60  L        0


of a Grass 4-channel electroencephalo-
graph (see Figure 1). Marks also were
recorded on this channel indicating 
the onset and the end of the tones and 
shocks. The EEG was recorded on the 
remaining three channels of the instru- 
ment when the EEG and EDG were 
recorded simultaneously. At the be- 
ginning and at the end of the test, the 
EEG was recorded on all four channels. 

The subject reclined in a comfortable 
chair in a darkened room while the 
tester was in an adjoining room with 
the stimulating and recording equip- 
ment. A talk-back system permitted 
communication between subject and 
tester. 


Preparation of the Subject. For the 
EEG, silver-disc electrodes were ap-
plied to the scalp with bentonite paste
over a surface previously cleaned with 
alcohol and rubbed with abrasive jelly. 
The electrodes were held in place with 
adhesive tape. The subject was ground- 
ed by electrodes attached to his ears. 
Figure 1 shows the placement of the 
eight recording EEG electrodes. 

The electrodes for the EDG were 
zinc plates coated with a zinc sulfate- 
kaolin paste. These were taped to the 
fingertips of the left hand. Shock- 
electrodes were taped to the calf of 
the left leg. 


EEG Control-Recording. The sub- 
ject was told to relax and to close his 
eyes, and that he would receive neither 
tones nor shocks during this portion 
of the test. Then for about six minutes
the EEG was recorded on all four
channels from various electrode com- 
binations. 


Placement of Earphones. Earphones 
were placed over the ears with the 
yoke of the headset across the top of 
the head and resting over the centrally 
placed electrodes. 


Preparation for Simultaneous Re- 
cording. The EDG electrodes were 
substituted for the EEG electrodes on 
the fourth channel of the EEG machine 
by means of a simple switching circuit. 
The other three channels continued 
to record the EEG, usually from the 
combinations shown in Figure 1. 


Determination of Strength of Shock. 
Shocks were given first at subliminal 
intensities and then increased until the 
subject reported feeling something on 
his leg. The shock was then increased 
until the subject said that it was very 
annoying and that he would not want 
to have it any stronger. This strength 
of shock was used throughout the test. 


Determination of Threshold by Ver- 
bal Responses. Threshold was deter- 
mined for both ears of each subject 
for 1000 cps by having the subject 
respond verbally each time a tone was 
heard. A descending series was used five 
times for each ear; threshold was con- 
sidered to be the lowest level to which 
the subject responded at least three 
times. 
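
The rule for the verbal threshold can be written out as a short Python sketch. The function name, the data layout (one set of responded-to levels per descending series) and the sample levels are illustrative assumptions; only the rule itself (lowest level with at least three responses in five series) comes from the procedure described above.

```python
from collections import Counter


def verbal_threshold(series_responses, min_hits=3):
    """Return the lowest level (db) responded to in at least `min_hits` series.

    series_responses: one collection per descending series, holding the
    levels at which the subject reported hearing the tone.
    Returns None if no level reaches the criterion.
    """
    hits = Counter()
    for responded_levels in series_responses:
        # Count each level at most once per series.
        for level in set(responded_levels):
            hits[level] += 1
    qualifying = [level for level, count in hits.items() if count >= min_hits]
    return min(qualifying) if qualifying else None


# Hypothetical responses from five descending series for one ear.
runs = [{20, 15, 10, 5}, {20, 15, 10}, {20, 15, 10, 5}, {20, 15, 10, 5, 0}, {20, 15}]
print(verbal_threshold(runs))
```

In this hypothetical case the 5-db level is heard in three of the five series and the 0-db level in only one, so the 5-db level would be taken as threshold.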


EDR-Audio and EER-Audio. A 
schedule of stimuli for the EDR-Audio 
and EER-Audio (5), performed simul- 
taneously, is shown in Table 1. Within 
each series there were six randomiza- 
tions of eight separate stimulus-condi- 
tions. Four different sensation levels 
(SL) were used: +10, +5, 0 and -5
db with respect to the previously de-
termined thresholds for each ear. The
stimuli were distributed randomly with 
respect to sensation level and ear. In- 
termingled with the eight stimuli just 
described were two control-intervals, 
that is, two times in each subseries of 
eight stimuli a stimulus-mark was re- 
corded on the fourth channel without 
any auditory stimulus presented to the 
subject. Shocks (4.5 seconds after the 
onset of the tone) also were distributed 
randomly in each subseries. Shocks
followed only the stimuli which were 
at +10 or +5 db SL and only after 
three of four such stimuli in each sub- 
series. 
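
A schedule with the properties just described can be generated as in the following Python sketch. It is an illustration only: the method of randomization actually used is not stated in the text, and the function names, the dictionary layout and the seed are assumptions. Each subseries holds the eight stimulus-conditions (four sensation levels for each ear) and two control-intervals, and shocks follow three of the four stimuli at +10 or +5 db SL.

```python
import random

SENSATION_LEVELS = (10, 5, 0, -5)   # db re the verbal threshold of each ear
EARS = ("R", "L")


def make_subseries(rng):
    """Build one randomized subseries: 8 stimuli plus 2 control-intervals."""
    stimuli = [{"ear": ear, "sl": sl, "shock": False}
               for ear in EARS for sl in SENSATION_LEVELS]
    # Shocks are restricted to the stimuli at +10 or +5 db SL (four per
    # subseries), and are given after three of those four.
    reinforceable = [s for s in stimuli if s["sl"] in (10, 5)]
    for stimulus in rng.sample(reinforceable, 3):
        stimulus["shock"] = True
    # Control-intervals: a stimulus-mark is recorded but no tone is presented.
    controls = [{"ear": None, "sl": None, "shock": False} for _ in range(2)]
    block = stimuli + controls
    rng.shuffle(block)
    return block


def make_schedule(n_subseries=6, seed=0):
    """Concatenate independently randomized subseries into one schedule."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_subseries):
        schedule.extend(make_subseries(rng))
    return schedule


if __name__ == "__main__":
    for number, item in enumerate(make_schedule(), start=1):
        label = "CONTROL" if item["sl"] is None else f'{item["ear"]} {item["sl"]:+d}'
        print(number, label, "+" if item["shock"] else "")
```

Six subseries of ten intervals give the 60 numbered intervals of a schedule such as that in Table 1.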

During the test, only the number 
from the schedule was placed alongside 
of the stimulus-mark. Consequently, 
when a decision was made later as to 
whether there had been a significant 
change in the EDG or EEG following 
the stimulus-mark, there was no indica- 
tion of which ear had been stimulated, 
what SL of sound had been used, or 
whether there had been any stimulus at 
all. Two randomizations of the same 
stimuli were used to prevent inadvert- 
ent memorization of a single schedule 
as analysis of the records continued. 


EEG Control-Recording. When the 
test was finished, the headphones and 
shock-electrodes were removed. The 
subject was told that there would be 
no more tones or shocks and was asked 
to relax. The EEG was then recorded 
on all four channels. 

The entire procedure from the 
placement to the removal of the elec- 
trodes took about an hour and a half. 


Analysis of Records 


Analysis of Responses, EDR. A 
change in resistance was judged to be 
an electrodermal response primarily on
the basis of latency. Almost all of the 
responses fell within a range of 1.5 to 
3.0 sec following the onset of a tone 
(see Figure 1). The amplitude of the 
responses as well as the rate at which 
they increased to their maxima usually 
exceeded the amplitude and rate of the 
random fluctuations in resistance and 
thus were helpful in distinguishing re- 
sponses to tones from random changes 
in resistance. 


Analysis of Responses, EER. The 
electroencephalic changes were varied 
with respect to latency, magnitude and 
kind. Suppression of existing activity 
(Figure 1) was the most common re- 
action noted. Sometimes, however, the 
response was an increase in the ampli- 
tude of the existing pattern or was the 
introduction of a new pattern. Fre- 
quently, the reaction to sound was a 
slow wave following the onset of the 
tone by 0.2 or 0.3 sec. In the present 
study, a judgment of response depend-
ed only upon how different from the
preceding pattern (and sometimes from
the subsequent pattern) the EEG ap-
peared in the five-second interval
during which the tone was presented.

Changes in the EEG following ces- 
sation of the tone were ignored. These 
off-responses were often quite distinct, 
but for the tones which were followed 
by shock the off-responses in the EEG 
could have been attributed as much (or 
more) to the effect of the shock as to 
the effect of cessation of the sound. 
To know definitely when a shock had 
been given would have necessitated 
referring to the schedule of stimuli 
while making judgments of responses. 
Because any knowledge of the stimuli 
was avoided during analysis, no refer- 
ence was made to the schedule and, 
therefore, the off-responses to all stim- 
uli were ignored. 

Although not all responses could be 
judged with equal certainty, a simple 
yes or no decision was finally made 
in each case.


Classification of EEG Patterns. Three 
samples were taken from identical por- 
tions of the EEG record of each sub- 
ject. The 22 men were ranked accord- 
ing to the prominence of the alpha 
rhythm in these samples. The 14 
women were similarly ranked. Because 



Table 2. Proportions of electrophysiologic responses (EDR and EER) to sound as a function of Sensation Level.

                                              SL (db)
Sex     Measure   Alpha Level   Control    -5     0    +5   +10    Mean

Men     EDR       High            .02     .08   .11   .27   .36    .17
                  Low             .03     .09   .35   .61   .68    .35
                  Mean            .02     .08   .23   .44   .52    .26

        EER       High            .07     .19   .23   .39   .43    .26
                  Low             .01     .06   .13   .27   .44    .18
                  Mean            .04     .12   .18   .33   .44    .22

Women   EDR       High            .00     .01   .13   .36   .38    .18
                  Low             .01     .02   .17   .30   .32    .16
                  Mean            .01     .02   .15   .33   .35    .17

        EER       High            .04     .09   .14   .19   .24    .14
                  Low             .05     .06   .07   .17   .19    .11
                  Mean            .04     .08   .11   .18   .22    .12

Table 3. Results of analysis of variance across all subjects.

Source of Variation               df       ms        F

Sex                                1     6.16     9.49*
Subjects within Sex               34      .65
Measures (M)                       1      .24
M x Sex                            1      .02
M x Subjects within Sex           34      .68
SL                                 4    13.75    74.27*
SL x Sex                           4      .52     2.81†
SL x Subjects within Sex         136      .19
M x SL                             4      .79     7.06*
M x SL x Sex                       4      .03
M x SL x Subjects within Sex     136      .11

* Significant at or beyond the 1% level of confidence.
† Significant at the 5% level of confidence.



there are no accepted standards by 
which one person’s EEG can be classi- 
fied as high-alpha and another’s as 
low-alpha, the 11 men whose EEG 
showed the most prominent alpha 
rhythm were classified as high-alpha, 
and the 11 whose EEG showed the 
least alpha rhythm were classified as 
low-alpha. The 14 women were simi- 
larly divided. Charan and Goldstein 
(2), who labelled their groups simi- 
larly, used an arbitrary measure of 
alpha prominence as a dividing point 
for the two groups. As a consequence, 
their groups were divided on a 60-40 
basis (high-alpha—low-alpha) com- 
pared to the 50-50 division in the pres- 
ent study. 


Results 


Group Data. The proportions of 
electrophysiologic responses (EDR and 
EER) as a function of the SL of the 
sound are presented in Table 2 for the 
high-alpha and low-alpha subgroups of 
the male and female subjects. 

The statistical design used to evalu- 
ate these data is a three-way-classifica- 
tion analysis of variance. For two 
effects, measures (EDR and EER) and
sensation levels, measurements were re-
peated on all subjects; for the third
effect (either sex or EEG pattern with- 
in sex) measurements were independ- 
ent. In the analysis across all subjects, 
the independent measure in the analysis 
of variance was sex. To determine 
whether or not EEG pattern is related 
to electrophysiologic responsiveness to 
sound, the male and female subjects 
were analyzed separately. Separate 
analyses were necessary since the cri- 
terion upon which the EEG classifica- 
tion was made was not independent of 
the subjects sampled. All proportions 
were transformed to $2 \arcsin p^{1/2}$ (in
radians) before statistical analysis. 

A major purpose of this study was 
to determine whether sex and EEG 
pattern have any relation to the ease 
with which auditory thresholds can 
be determined from EDR and EER to 
sound. A crucial assumption underlying 
any threshold determination is, of 
course, that the proportion of responses 
increases as the SL of the stimulus is 
raised. In Table 3 it can be seen that 
the differences among levels are signifi-
cant beyond the 1% level. In addition
to the required increase in proportion 

Figure 2. Electroencephalic responses (EER)
and electrodermal responses (EDR) as a
function of level of stimulus. Each point
represents 432 observations (6 trials for each
ear of 36 subjects).

of responses as SL is raised, threshold 
determinations become more precise 
as the function relating proportion of 
responses to SL becomes steeper (8). 

Figure 3. Electrophysiologic responsiveness
as a function of level of stimulus and sex. For 
the male subjects each point represents 528 
observations (2 kinds of responses to each 
of 6 stimuli for each ear of 22 subjects). For 
the female subjects each point represents 336 
observations (2 kinds of responses to each of 
6 stimuli for each ear of 14 subjects). 


For this reason, the interactions of sen- 
sation levels with the other factors (sex, 
EEG pattern, measures) should indicate 
when threshold determinations will be 
affected. For example, the measures-by- 
SL interaction reported in Table 3 is 
significant at the 1% level. When the 
data from Table 2 are plotted to show 
this interaction, as in Figure 2, it can 
be seen that at the control level and 
-5 db SL there is a smaller proportion
of EDR than EER, but at 0, 5 and 10 
db SL there is a greater proportion of 
EDR than EER. Because the response 
function for EDR rises more sharply 
than for EER (Figure 2), it should be 
easier to determine threshold from the 
EDR than from EER. 

Sex Differences. Two of the effects 
involving sex, as reported in Table 3, 
were significant. Although not directly 
relevant, it should be noted that the 
male subjects gave a significantly
higher proportion of both EDR and 
EER than the female subjects. The
finding that is particularly relevant
to this presentation is the signifi-
cant SL-by-sex interaction. When
the response functions for male and 
female subjects are plotted separately, 
as in Figure 3, it can be seen that the 
proportion of responses from male sub- 
jects increases more rapidly than from 
female subjects as SL is raised. Again, 
the more steeply rising response func- 
tion for male subjects implies that it 
should be easier to determine thresholds 
for male than for female subjects. 

EEG Pattern. To determine if EEG 
pattern is related to the outcome of 
the procedure for threshold determina- 
tion, it was necessary to analyze the 
data for male and female subjects sep- 
arately for the reason previously given. 
The results of these analyses are sum- 

Table 4. Results of analyses of variance for female and male subjects in which EEG pattern is used
as a variable.

                                              Females                  Males
Source of Variation                      df     ms       F        df     ms       F

EEG Group                                 1    .13                 1    .70
Subjects within EEG Groups               14    .24                20    .90
Measures (M)                              1    .04                 1    .22
M x EEG Group                             1    .02                 1   6.73   17.56*
M x Subjects within EEG Groups           14    .62                20    .38
SL                                        4   3.21   25.54*        4  11.06   57.86*
SL x EEG Group                            4    .03                 4    .68    3.57*
SL x Subjects within EEG Groups          56    .13                80    .19
M x SL                                    4    .45    3.74†        4    .36    4.21*
M x SL x EEG Group                        4    .10                 4    .39    4.55*
M x SL x Subjects within EEG Groups      56    .12                80    .09

* Significant at or beyond the 1% level.
† Significant at the 5% level.

marized in Table 4. For the females 
it can be seen that none of the sources 
of variation involving EEG pattern are 
significant. The results of the analysis 
for females alone are in no way differ- 
ent from the results across all subjects 
when EEG pattern is ignored. 

For the male subjects, however, EEG 
pattern seems to be related to the out- 
come of the threshold determination 
from the EDR-Audio and the EER- 
Audio, in that every interaction in 
Table 4 which involves EEG pattern 
is significant at the 1% level. The first 
of these significant interactions, meas- 
ures-by-EEG pattern, is not related 
directly to threshold determination, 
since it does not involve sensation 
levels. The source of this interaction 
is that the low-alpha males give a sig- 
nificantly larger proportion of EDR 
and a significantly smaller proportion 
of EER than the high-alpha males (see 
Table 2). 

The other two significant interac- 
tions found in the case of the men, SL- 
by-EEG pattern and measures-by-SL- 
by-EEG pattern, do involve sensation 
level and are, therefore, directly re- 
lated to threshold determination. These 
interactions are probably related to the
rate with which EDR increases as a
function of SL for the low-alpha men. 
This rate is much more rapid than for 
EDR in high-alpha men and for EER 
in both groups of men (see Table 2). 
These results imply that threshold 
determination should be easiest for 
low-alpha men, using EDR as an indi- 
cator of hearing. 


Individual Thresholds. It was neces- 
sary to establish a criterion for thresh- 
old that would permit consistent and 
meaningful estimations of thresholds. 
An examination of some of the assump- 
tions underlying conventional audio- 
metric procedures offers some insight 
into the problem of selecting a criterion 
for threshold. 

Certain of the assumptions underly- 
ing the psychophysical methods used 
for determination of threshold that are 
pertinent to the problem of selecting 
a criterion for threshold in the proce- 
dure used here are (a) percentage of 
responses to stimuli is expected to in- 
crease from 0% to 100% as the inten- 
sity of the stimuli is increased through 
a narrow range of intensities, and (b) 
the curve describing the increased per-
centage of response as a function of
sensation level (response-by-SL func-
tion) is expected to be a normal ogive,
so that the mean and median (50% 
response) coincide with the midpoint 
of the range of intensities. 

These assumptions may not be ten- 
able for the present situation. The 0% 
end of this response continuum may be 
obscured by random changes in the 
state of the response-system which may 
coincide with a stimulus-mark and be 
interpreted as a response to a particular 
auditory stimulus. In addition, errors in 
judgment may lead the person who is 
analyzing the records to consider some 
irrelevant change as a response to a 
stimulus. Similar difficulties are en- 
countered in psychophysical proce- 
dures but not to the extent that they are 
encountered in the electrophysiologic 
studies on this same population of nor- 
mal adults. The control-interval was 
introduced into the electrophysiologic 
procedure to permit evaluation of the 
frequency of spurious responses and 
of errors in analysis in order to estab-
lish the 0%-end of the response-by-SL
function.

At the other extreme, the phenom- 
enon of adaptation reduces the proba- 
bility of a response-by-SL function that 
increases to 100% in the desired fash- 
ion. The procedure used here, although 
it tended to reduce adaptation, was not 
designed to elicit 100% responses from 
each subject. 

The 0%-end of the response-by-SL 
functions obtained in this study is the 
only aspect corresponding to the psy- 
chophysical function that can be es- 
tablished with any degree of certainty. 
The criterion for threshold adopted, 
therefore, was that threshold for an 
ear would be defined as the lowest in-
tensity-level at which a significantly
greater proportion of responses occurs
than occurs in the control-intervals.

Most statistical tests of the difference 
between proportions cannot be recom- 
mended for use with proportions that 
are either extremely large or extremely 
small, but conversion of all of the pro-
portions involved to $2 \arcsin p^{1/2}$ (in
radians), as recommended by Walker 
and Lev (11), produces a variable 
whose variance is independent of the 
magnitude of the proportion. Walker 
and Lev give formulae for calculating 
the arcsin for proportions of 0.00 and 
1.00, and a table for the intermediate 
values. The variance of this transform-
ed variable has the value of 1/n as its
upper limit, where n equals the number
of cases upon which the proportion
is based. For proportions approximat-
ing zero (that is, the proportion of re-
sponses in the control-intervals), the
number (n) of observations upon which
the proportion is based should be 
greater than 10. When 1/n is used as 
the variance of the transformed vari- 
able, it is possible to test the significance 
of difference between two arcsins. This 
procedure does not violate the assump- 
tion of independence underlying the 
use of this statistic, since a random 
sample of the behavior of an individual 
is being evaluated. 

The procedure outlined above was 
followed, and the standard error of the 
difference between the arcsin for the 
control situation (n = 12) and that
for any level for any ear (n = 6) was
calculated. Since the hypothesis being 
tested was that significantly more than 
zero responses occurred, a one-tail test 
was used. The 5% point was used as a 
critical value. On this basis when no 
response occurs in the control-intervals
(0/12), two responses to a particular

Table 5. Thresholds successfully estimated from EDR-Audio and EER-Audio (db Sensation Level
re behavioral thresholds).

              Male                              Female
   High-Alpha       Low-Alpha        High-Alpha       Low-Alpha
  Right    Left    Right    Left    Right    Left    Right    Left
(EDR Audio)
10 10 -5 5 5 10
10 5 0 -5 0
5 5 0 5
5 0 5 0
5 0 0 -5 0 -5
10 5 0 5 0
10 10 0 5 10
5 5 0
10 5
5 5 0 0
-5 -5 5 0
(EER Audio)
-5 -5 10 10 10
5 5 10 5 10 0
0 0 0 5 10 5
0 5 10 5 10
-5 5 5 5
10 10
-5 10 5 5
0 0
10
5 10
0

stimulus (2/6) is the least number
of responses that is significantly
different from the controls; when one 
response occurs in the 12 control-in- 
tervals (1/12), three responses to a 
particular stimulus (3/6) is the least 
number of responses that is significant- 
ly different from the controls; and when 
two or three (2/12, 3/12) responses 
occur in the control-intervals, four re- 
sponses to a particular stimulus (4/6) is 
the least number of responses that is 
significantly different from the con- 
trols. 
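
The least numbers of responses stated above can be checked numerically. The following Python sketch applies the transformation $2 \arcsin p^{1/2}$, takes 1/n as the variance of each transformed proportion, and compares the resulting critical ratio with the one-tail 5% point of 1.65. The formulae of Walker and Lev for proportions of 0.00 and 1.00 are not reproduced in the text; the substitution of 1/(4n) and 1 - 1/(4n) for such proportions is an assumption made here, as are the function names.

```python
import math


def arcsine_transform(hits, n):
    """Return 2*arcsin(sqrt(p)) in radians for hits/n.

    Assumption: proportions of exactly 0 or 1 are replaced by 1/(4n) and
    1 - 1/(4n) before transformation; the text does not state the
    adjustment actually used.
    """
    if hits == 0:
        p = 1.0 / (4 * n)
    elif hits == n:
        p = 1.0 - 1.0 / (4 * n)
    else:
        p = hits / n
    return 2.0 * math.asin(math.sqrt(p))


def critical_ratio(control_hits, stimulus_hits, n_control=12, n_stimulus=6):
    """Difference between the two arcsins divided by its standard error."""
    difference = (arcsine_transform(stimulus_hits, n_stimulus)
                  - arcsine_transform(control_hits, n_control))
    # Each transformed proportion has variance 1/n.
    standard_error = math.sqrt(1.0 / n_control + 1.0 / n_stimulus)
    return difference / standard_error


def least_significant_count(control_hits, critical=1.65,
                            n_control=12, n_stimulus=6):
    """Smallest number of responses (out of n_stimulus) exceeding `critical`."""
    for hits in range(n_stimulus + 1):
        if critical_ratio(control_hits, hits, n_control, n_stimulus) > critical:
            return hits
    return None


if __name__ == "__main__":
    for control_hits in range(4):
        print(control_hits, least_significant_count(control_hits))
```

Under the stated assumption the sketch returns 2, 3, 4 and 4 responses for 0, 1, 2 and 3 responses in the control-intervals, in agreement with the values given in the text.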

For each subject four response-by- 
SL functions were evaluated in terms of 
this criterion for threshold: EDR and 
EER for each ear. Evaluations of these 
functions are presented in Table 5. Es- 
timation of threshold was considered 
successful if the above criterion was 
met at any sensation level from -5 db
to +10 db, inclusive. 
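
The evaluation of a single response-by-SL function may be sketched in Python as follows. The required numbers of responses are those stated in the text for 0 to 3 responses in the 12 control-intervals; the function name, the data layout and the sample counts are hypothetical. The text gives no rule for more than three control responses, and the sketch simply reports no threshold in that case.

```python
# Least number of responses (out of 6) that differs significantly from the
# controls, keyed by the number of responses in the 12 control-intervals.
MIN_RESPONSES = {0: 2, 1: 3, 2: 4, 3: 4}


def estimate_threshold(control_hits, hits_by_sl):
    """Return the lowest sensation level (db) meeting the criterion, or None.

    control_hits: number of responses in the 12 control-intervals.
    hits_by_sl:   mapping of sensation level (db) to responses out of 6.
    """
    required = MIN_RESPONSES.get(control_hits)
    if required is None:
        # More than three control responses: no rule is stated (assumption:
        # treat the estimation as unsuccessful).
        return None
    for level in sorted(hits_by_sl):
        if hits_by_sl[level] >= required:
            return level
    return None


# Hypothetical counts for one ear and one measure.
print(estimate_threshold(0, {-5: 0, 0: 1, 5: 3, 10: 4}))
```

With no responses in the control-intervals, the hypothetical ear above first reaches two responses at +5 db SL, which would be entered as its threshold.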

The measures-by-SL and SL-by-sex
interactions found in the analysis
across all subjects are explained by the 
fact that the response-by-level func- 
tion for EDR rises more rapidly than 
for EER (Figure 2) and that the re- 
sponse-by-level function for male sub- 
jects rises more rapidly than for female 
subjects (Figure 3). These results im- 
ply that it is easier to obtain thresholds 
from EDR than from EER, and that 
it is easier to obtain thresholds for the 
male subjects than for the female sub- 
jects. The steeper slope (h) in each 
case (EDR and male subjects) implies 
a smaller standard deviation
($h^2 = 1/(2\sigma^2)$) (8), and a smaller stand-
ard deviation permits more precise
estimates. These implications can be 
validated or rejected by testing the dif- 
ferences between the proportion of sub- 
jects for whom thresholds were ob- 
tained and the difference between the 
proportion of thresholds successfully 
estimated from the EDR-Audio and
EER-Audio. The resultant hypotheses
include direction, that is, proportion of 
males greater than the proportion of 
females, and proportion of successful 
thresholds by EDR-Audio greater than 
the proportion of successful thresholds 
by the EER-Audio. A one-tail test was 
therefore used to check these hypoth- 
eses. 

In order to meet the assumption of 
independence involved in testing the 
significance of difference between two 
proportions (3), it was necessary to 
include the subjects for whom a thresh- 
old was obtained on only one ear either 
with the successes (thresholds for both 
ears) or with the failures (threshold 
for neither ear). The more conserva- 
tive position was adopted here and 
subjects for whom threshold was ob- 
tained for only one ear were included 
with the failures. 

The comparison between the pro-
portion of successful threshold deter-
minations among male and female sub-
jects was made separately for EDR-
Audio and EER-Audio. A test for the
difference between uncorrelated pro- 
portions, corrected for continuity, was 
used (11, p. 429). In the case of the 
EDR-Audio, the CR is 1.73 and in the 
case of the EER-Audio, the CR is 2.80. 
These critical ratios are beyond the 
5% critical point of a one-tail test 
(CR = 1.65). They indicate that a 
greater proportion of successful thresh-
old determinations was obtained for
male than for female subjects with 
both EDR-Audio and EER-Audio. 
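
A critical ratio of this kind may be computed as in the Python sketch below. The exact formula of Walker and Lev is not reproduced in the text; a common textbook form (pooled proportion, with a continuity correction of half the sum of the reciprocal sample sizes) is assumed here. The counts are hypothetical and are not the data of this study.

```python
import math


def cr_uncorrelated(successes_1, n_1, successes_2, n_2):
    """Critical ratio for two independent proportions, corrected for continuity.

    Directional form: tests whether group 1 exceeds group 2.
    """
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_1 + 1.0 / n_2))
    correction = 0.5 * (1.0 / n_1 + 1.0 / n_2)
    return (p_1 - p_2 - correction) / standard_error


# Hypothetical counts: thresholds for both ears in 12 of 22 men, 3 of 14 women.
cr = cr_uncorrelated(12, 22, 3, 14)
print(round(cr, 2), cr > 1.65)
```

The value obtained is compared with 1.65, the 5% critical point of a one-tail test.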

A test of correlated proportions (3) 
was used to test the significance of dif- 
ference between the proportion of 
thresholds from the EDR-Audio and 
from the EER-Audio. Again the correc-
tion for continuity was used. A CR
of 1.66 was obtained. This value is
beyond the 5% critical point for a 
one-tail test (CR = 1.65). It indicates 
that a greater proportion of successful 
threshold determinations was obtained 
with EDR-Audio than with EER- 
Audio. 
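
For correlated proportions only the discordant cases contribute to the test. The Python sketch below uses the familiar form of the test with a continuity correction of one case; whether this is the exact formula of reference 3 is an assumption, and the counts shown are hypothetical rather than those of the present study.

```python
import math


def cr_correlated(first_only, second_only):
    """Critical ratio for correlated proportions, corrected for continuity.

    first_only:  cases successful with the first method only.
    second_only: cases successful with the second method only.
    Directional form: tests whether the first method succeeds more often.
    """
    discordant = first_only + second_only
    if discordant == 0:
        return 0.0
    return (first_only - second_only - 1) / math.sqrt(discordant)


# Hypothetical counts: 14 subjects with thresholds from EDR-Audio only,
# 6 subjects with thresholds from EER-Audio only.
cr = cr_correlated(14, 6)
print(round(cr, 2), cr > 1.65)
```

Subjects for whom both methods succeeded, or both failed, do not enter the calculation.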

The implication, based on the anal- 
ysis of male subjects (Table 4), that 
the EDR-Audio for low-alpha men 
would result in a significantly greater 
proportion of successful threshold de- 
terminations than EDR-Audio for high- 
alpha men, or EER-Audio for either 
group, was tested in the same manner 
as the implication from the overall anal- 
ysis (Table 3). In the case of male 
subjects alone, none of the implications 
based on the group data were support- 
ed by the analysis of the proportion of 
successful threshold determinations. 


Discussion 


The relations of sex and EEG-pat- 
terns to electrodermal responsiveness 
found in this study tend to validate 
similar findings reported previously by 
Charan and Goldstein (2). The cer- 
tainty of these relations is strengthened 
by the fact that different procedures 
and different subjects were used in 
each study. It is not known whether 
these relations exist also in children. 

Data on the estimation of individual 
threshold bear out some of the infer- 
ences from the group data: (a) it is 
easier to obtain thresholds for men 
than for women, and (b) it is easier to 
obtain thresholds from the EDR-Audio 
than from the EER-Audio. These find- 
ings, however, should not be construed 
to imply that the subjects for whom 
no thresholds were obtained are im- 
possible to test. Modifications in pro- 
cedure such as a preliminary condi-
tioning session, more frequent rein-
forcement (and not only at 10 and 5 
db SL) and increase in the intensity of 
the shock throughout the test undoubt- 
edly would have produced more suc- 
cess. The successful tests (about 50%) 
give sufficient indication, however, of 
the feasibility of objective confirma- 
tion of thresholds by electrophysiologic
techniques within 5 or 10 db of thresh- 
olds previously determined from verbal 
responses. 

The criterion for threshold estab- 
lished in this paper also provides a basis 
for objective determination of thresh- 
old when no prior information about a 
patient’s auditory sensitivity is avail- 
able. A concurrent study in which this 
criterion was applied to the testing of 
children has already been reported 
(12). 

Summary 

Electrodermograms and electroen-
cephalograms were recorded simulta- 
neously from adults with normal hear- 
ing while tones of 1000 cps at -5, 0,
+5 and +10 db sensation level were 
presented in random order in either 
ear. The proportion of electrodermal 
responses (EDR) and electroencephalic 
responses (EER) increased as a func- 
tion of intensity, the EDR at a greater 
rate than the EER. The male subjects 
with a dominant alpha rhythm in their 
electroencephalograms gave more EER 
than EDR; the male subjects whose 
electroencephalograms contained little 
or no alpha rhythm gave more EDR 
than EER. The female subjects re- 
sponded less often than the male sub- 
jects and the percentage of EDR or 
EER they gave was not related to their 
EEG patterns. 

A criterion was developed for esti- 
mation of threshold for individual ears
on the basis of percentage of responses
as a function of intensity. Estimation 
of threshold was successful more often 
for the male than for the female sub- 
jects and more often from the EDR 
than from the EER. 


References 

1. Berry, J. L., and Martin, B., GSR reac-
tivity as a function of anxiety, instruc-
tions, and sex. J. abnorm. (soc.) Psychol.,
54, 1957, 9-12.

2. Charan, K. K., and Goldstein, R., Re-
lation between EEG pattern and ease
of eliciting electrodermal responses. J.
Speech Hearing Dis., 22, 1957, 651-661.

3. Edwards, A. L., Experimental Design
in Psychological Research. New York:
Rinehart and Co., 1950.

4. Edwards, R. E., Magnitude of the gal-
vanic skin response as a function of
auditory stimulus intensity. Doctoral
Dissertation, University of Washington,
1952.

5. Goldstein, R., and Derbyshire, A. J.,
Suggestions for terms applied to electro-
physiologic tests of hearing. J. Speech
Hearing Dis., 22, 1957, 696-697.

6. Goldstein, R., Ludwig, H., and Naun-
ton, R. F., Difficulty in conditioning
galvanic skin responses: its possible sig-
nificance in clinical audiometry. Acta
Oto-laryng., 44, 1954, 67-77.

7. Grant, D. A., and Schneider, Dorothy
E., Intensity of the conditioned stimulus
and strength of conditioning: II. The
conditioned galvanic skin response to an
auditory stimulus. J. exp. Psychol., 39,
1949, 35-40.

8. Guilford, Joy P., Psychometric Methods.
New York: McGraw-Hill Book Co.,
1936.

9. Hovland, C. I., The generalization of
conditioned responses: II. The sensory
generalization of conditioned responses
with varying intensities of tone. J. genet.
Psychol., 51, 1937, 279-291.

10. Hovland, C. I., and Riesen, A. H., Mag-
nitude of galvanic and vasomotor re-
sponses as a function of stimulus intensity.
J. gen. Psychol., 23, 1940, 103-121.

11. Walker, Helen M., and Lev, J., Sta-
tistical Inference. New York: Henry
Holt and Co., 1953.

12. Withrow, F. B., Jr., and Goldstein, R.,
An electrophysiologic procedure for de-
termination of auditory threshold in
children. Laryngoscope, 48, 1958, 1674-
1699.





Nasality In Isolated Vowels And Connected 
Speech Of Cleft Palate Speakers 


DUANE C. SPRIESTERSBACH 


GENE R. POWERS 


Articulation is currently receiving 
prominent attention in the diagnostic, 
therapeutic and research approaches to
the speech of cleft palate speakers. Ex- 
cessive nasality is also recognized as 
one of the typical characteristics of 
cleft palate speech. It may be the most 
deviant aspect of the speech of some 
cleft palate speakers and it seems rea- 
sonable to assume that the nature of 
the therapy techniques may be de- 
termined, in part, by the presence or 
absence of excessive nasality. 

Many of the techniques used for 
the assessment of the nasality in speech 
are based upon the assumption that 
there is a close relationship between 
nasality in isolated vowels and con- 
nected speech. As a consequence, the 
nasality in the connected speech of a 
speaker is indirectly assessed by judg-
ing the nasality in selected isolated
vowels, sometimes only one or two 
vowels. The validity of this assumption 
depends also upon the further as- 
sumption that the degree of nasality for 
a given speaker is relatively constant 
from one vowel phonation to another 
and from isolated vowel phonations to 
connected speech. In addition, many 
diagnostic procedures require the as- 
sumption that valid judgments of the 
nasality of connected speech can be 
made even though the sample may con- 
tain misarticulations and other devi-
ant speech characteristics which di-
minish its intelligibility and interfere
with communication. 

Van Hattum (9) has already studied 
the relationships under discussion. He 
reported a statistically significant but 
low correlation of .48 between the 
judgments of nasality in isolated vowels 
and connected speech samples of 20 
cleft palate speakers. He also found 
that the front vowels of his speakers 
were judged to be more nasal than the 
back vowels and that no systematic 
differences existed between the high 
and low vowels. This latter finding 
is difficult to interpret in view of the 
findings of Kelly (5), Hixon (2), Kalt- 
enborn (4), House and Stevens (3)
and Hess (1) which suggest that there
is a basis for a systematic relationship 
between the height of tongue place- 
ment and the degree of perceived 
nasality of vowels. 

The present study is a further at- 
tempt to evaluate the relationship be-
tween connected speech and isolated 
vowels on perceived nasality and to 
evaluate such differences as may exist 
among selected vowels. 


Subjects and Procedure 


The subjects for this study consisted 
of 50 children, 14 girls and 36 boys, 
with cleft lips and palates or cleft 
palates only, ranging from five to 15 
years of age. Eleven had cleft palates 
only and 39 had both cleft lips and 
cleft palates. The palatal clefts of 36 
children had been closed surgically, 10 
were fitted with obturators and four 
had open palatal clefts. 

High-fidelity tape recordings were 
made of each speaker sustaining seven 
vowels: [i], [e], [ɑ], [æ], [o], [u]
and [ʌ]. These vowels were selected
to be representative of the various 
tongue placements for vowel produc- 
tions and included those most often 
suggested for use in identifying and 
describing nasality. Recordings were 
made also of the conversational speech 
of each subject. 

Two tapes were prepared, one for 
presenting the vowels and one for 
presenting conversational speech seg- 
ments 30 seconds long. The order of 
presentation was randomized for each 
tape. The tape with the conversational 
speech samples was played backwards 
to provide for minimizing the influ- 
ence on the nasality judgments of 
such irrelevant factors as defective
articulation and poor intelligibility.
Sherman (6) has demonstrated that 
more valid judgments of nasality of 
non-cleft speakers can be made if the 
speech samples are presented back- 
ward. Furthermore, she found that the 
backward-play method compared fa- 
vorably to the conventional forward 
presentation with respect to reliability. 
Spriestersbach (7), using the back- 
ward-play method in judging the nasal- 
ity in the speech of cleft palate speak- 
ers, has reported similar findings.

The samples were rated by 30 ad- 
vanced students in speech pathology 
who had had training and experience 
in diagnosing nasal voice quality. 
There were two experimental sessions, 
one for judging the nasality of the 
isolated vowels and the other for judg- 
ing the nasality in the connected speech 
samples. The listeners were asked to 
judge the severity of the nasality on 
a seven-point, equal-appearing intervals 
scale with one representing least severe 
nasal voice quality and seven represent- 
ing most severe nasal voice quality. 
For the purpose of checking reliability, 
25 of the vowels and 20 of the con- 
nected speech samples were repeated 
and judged a second time. 


Results 


Median scale values and Q values 
were computed for each vowel and 
for each connected speech sample in 
the manner described by Thurstone 
and Chave (8). The mean Q values 
were 1.07 for the vowels and .92 for 
the connected speech samples. 

For the purpose of evaluating re- 
liability, correlation coefficients meas- 
uring the strength of relationship be- 
tween two sets of scale values for 25
vowels and also for 20 connected
speech samples were obtained: .81 for 
the vowels and .97 for the connected 
speech samples. The scale values for 
connected speech were thus more re- 
liable than the scale values for vowels, 
as evaluated both by the reliability co- 
efficients and by the Q values. 


Table 1. Mean scale values of nasality of is-
olated vowels and correlation coefficients estimat-
ing strength of relationship between scale values
of nasality of connected speech and of isolated
vowels for children with cleft palates.

Vowel         Number*   Mean     r†
[i]             50      4.65    .47
[e]             32      4.51    .60
[ɑ]             46      3.25    .50
[æ]             45      4.01    .60
[o]             50      4.05    .58
[u]             49      4.63    .60
[ʌ]             48      3.78    .56
Vowel mean      50              .70

*Number of children producing an acceptable
phoneme varied.
†All rs significant beyond 1% level.


Correlation coefficients computed be- 
tween the sets of median scale values 
for each of the isolated vowels and 
the median scale values of connected 
speech samples are reported in Table 
1. It will be noted that all seven of 
these correlations proved to be sig- 
nificantly greater than zero beyond the 
1% level. However, the correlations 
are not high, ranging from .47 for [i]
to .60 for [e], [æ] and [u]. One fur-
ther correlation was computed between 
the median scale values of the con- 
nected speech samples and a set of 
means obtained by averaging the me- 
dian scale values of the vowels. The 
obtained correlation coefficient of .70 
is considerably higher than that found 
for any single vowel and the con- 
nected speech sample. Also given in 
Table 1 are the mean scale values for
each vowel. Examination of the means
and corresponding rs shows that the
relative severity of nasality among the
vowels is not related to the degree of
correlation between the scale values
of the vowels and connected speech.
For example, the correlations between
the scale values for the two vowels
judged as most nasal, [i] and [u], and
connected speech are at opposite ends
of the distribution.


Table 2. Summary of analysis of variance test-
ing differences among mean scale values of severity
of nasality obtained for six vowels produced by
40 children with cleft palates.

Source           df      ms       F*
Vowels (V)        5    12.50    21.55
Subjects (S)     39     4.20
VS              195      .58
Total           239

*F.05 (5 and 150 df) = 2.27.

A treatments-by-subjects analysis of 
variance was used to evaluate differ- 
ences among vowels on the severity 
measures. The vowel [e] was omitted
from this analysis because of the num-
ber of phonemically inaccurate pro-
ductions. The remaining vowels were 
produced by 40 of the subjects with 
satisfactory phonemic accuracy. Dif- 
ferences among vowels were highly 
significant according to the results of 
the F test reported in Table 2. 
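
The arithmetic behind the F test, and behind the critical difference used for the pairwise vowel comparisons reported in this section, can be checked with a short Python sketch. The mean squares and the number of subjects come from Table 2. The t value of 1.96 is an assumption (the large-sample two-tailed 5% point), since the value actually used is not stated, and the function names are illustrative only.

```python
from math import sqrt

# Mean squares and number of subjects as reported in Table 2.
MS_VOWELS = 12.50
MS_VOWELS_BY_SUBJECTS = 0.58
N_SUBJECTS = 40


def f_ratio(ms_effect, ms_error):
    """F for a treatments-by-subjects design: effect ms over interaction ms."""
    return ms_effect / ms_error


def critical_difference(t_value, ms_error, n_subjects):
    """Smallest difference between two vowel means counted as significant."""
    return t_value * sqrt(2 * ms_error / n_subjects)


f_vowels = f_ratio(MS_VOWELS, MS_VOWELS_BY_SUBJECTS)

# Assumed t value; the paper does not state the one it used.
ASSUMED_T = 1.96
cd = critical_difference(ASSUMED_T, MS_VOWELS_BY_SUBJECTS, N_SUBJECTS)
```

Under these assumptions the F ratio reproduces the tabled 21.55, and the critical difference comes out close to the .33 given in the paper's footnote.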





[i] 4.65                       [u] 4.63
[e] 4.51                       [o] 4.05
                [ʌ] 3.78
[æ] 4.01                       [ɑ] 3.25
front           middle         back


Figure 1. Mean scale values of severity of
nasality for isolated vowels, arranged accord-
ing to the tongue placement. Subjects were
children with cleft palates. See Table 1 for Ns.


Table 3. Median scale values of nasality of isolated vowels for selected subjects.

Subject [i] [e] [ɑ] [æ] [o] [u] [ʌ] Range
A 6.12 6.21 6.06 5.50 5.94 5.50 .71
B 3.10 3.88 3.51 3.17 3.12 2.86 1.02
C 6.06 2.17 2.7 2.70 3.93 1.27 4.79
D 4.70 4.30 3.88 5.10 3.70 5.21 1.51
E 6.06 3.67 2.50 2.32 3.12 4.50 3.23 3.74








Tongue height appears to be the 
most important variable related to per- 
ception of nasality. To provide for 
comparisons of obtained means in re- 
lation to tongue positions, the means 
are shown in Figure 1 along with a 
vowel diagram. The high vowels were 
judged as significantly* more nasal than
the low vowels. With the exception of 
the vowels with the highest tongue 
placement, [i] and [u], the front 
vowels were judged as more nasal than 
the back vowels. For those vowels in 
which tongue height is fairly compa- 
rable, the only significant difference 
was between [æ] and [ɑ]. The differ-
ence between [e] and [o], however,
was not tested because of the pre- 
viously mentioned lack of phonemic 
accuracy for [e]. 

The findings of the present study are 
in very close agreement with a similar 
analysis by Hess (1). In fact, the sta- 
tistically significant differences among 
vowels common to the two studies
are the same. Thus, the Hess study, as 
well as the present study, indicates that 
the high vowels of cleft palate speakers 
are perceived as more nasal than the 
low vowels. In this respect, the findings 
of these two studies differ from those 
of Van Hattum (9). All three studies, 
however, indicate that the front vowels 
of cleft palate speakers are judged to
be more nasal than the back vowels.

*C.D. (critical difference) = $t_{.05}\,(2\,ms_{VS}/s)^{1/2} = .33$

Several marked variations from 
group trends occurred for the individ- 
ual subjects. Each of the seven vowels, 
for instance, was judged as the least 
nasal vowel for at least one subject. 
Each vowel also was judged as the 
most nasal vowel for at least one sub- 
ject. The range of severity and the 
consistency of the nasality scale values 
for the vowels varied considerably 
among subjects. Table 3 presents some 
examples which illustrate variations. 
For some of the subjects the scale 
values were very similar for all of the 
vowels. The range for Subject A, for 
example, was only .71 of a scale value. 
Such consistency is found not only for 
individuals with the relatively extreme 
nasality scale values of Subject A. Sub- 
ject B also shows remarkable consis- 
tency although his scale values fall 
slightly below the middle of the scale. 
Some of the subjects received markedly 
different ratings for the various vowels. 
The scale value for Subject C’s most 
nasal vowel, [i], was nearly five points 
higher than the value for his least nasal 
vowel, [ʌ]. In the case of Subject D,
the vowel [u], although it was one of
the most nasal for the group, was rated
the least nasal, and the vowel [ʌ], one
of the least nasal for the group, was 
rated as most nasal. The vowels [i] 
and [u] produced by Subject E were 
rated as being relatively nasal in com- 
parison with the other vowels.


Discussion


While all of the correlation coef- 
ficients computed for the purpose of 
estimating the strength of the relation- 
ship between scale values of nasality in 
isolated vowels and in connected 
speech were significant, the coefficients
were not very high. It follows, as Van
Hattum (9) has suggested, that the pre-
diction of nasality in connected speech 
from the nasality observed in a single 
isolated vowel is questionable. Since 
a higher correlation was obtained be- 
tween the set of means obtained by av- 
eraging over vowels and the scale 
values of the connected speech samples, 
it would seem that the use of nasality 
ratings of several vowels in predicting 
the relative amount of nasality in con-
nected speech would be a more use-
ful procedure. It should be recognized,
however, that the lack of higher cor-
relations between the nasality scale 
values of isolated vowels and connected 
speech might have been due in part to 
the less reliable ratings for isolated 
vowels. 

There is considerable evidence that 
non-cleft individuals typically employ 
a more precise velopharyngeal closure 
when producing the high vowels than 
when producing the low vowels. Fur- 
thermore, House and Stevens’ (3) find- 
ings indicate that a relatively smaller 
degree of coupling of the nasal cavity 
to the oral cavity is required to per- 
ceive nasality in the high vowels than 
is the case for the low vowels. 

The degree to which the velo- 
pharyngeal valve is faulty among a 
group of cleft palate speakers can be
expected to vary considerably. There-
fore, those speakers who exhibit a 
moderately faulty valving would be 
expected to produce high vowels which
would be judged more nasal than the
low vowels. Those speakers with an 
extremely poor velopharyngeal valve 
might produce high and low vowels 
which would be perceived as equally 
nasal. The net result would be that cleft 
palate speakers, considered as a group, 
would exhibit more nasality on the
high vowels than on the low vowels
even though the opposite tends to be
true among normal speakers. 

The explanation for the finding of 
more nasality among the front vowels 
than among the back vowels is not 
so apparent. Perhaps the constriction 
of the oral cavity for the front vowels 
covers a larger area and thus creates 
a greater impedance for the flow of 
air through the oral cavity. House and 
Stevens’ (3) data for the absolute test 
of perceived nasality also indicate that, 
without consideration of tongue height, 
the front vowels require a slightly 
smaller area of nasal coupling to be 
judged nasal than do the back vowels. 


Table 4. Correlation coefficients estimating the
strength of relationship between ratios of vital
capacity (measure with nostrils open divided by
measure with nostrils occluded) and scale values
of severity of nasality in samples of connected
speech from children with cleft palates.

Vowel               Number      r
[i]                   50      -.38*
[u]                   49      -.43*
[e]                   32      -.19
[o]                   50      -.26
[æ]                   45      -.06
[ʌ]                   48      -.45*
[ɑ]                   46      -.28
Connected Speech      50      -.33†

*Significant beyond 1% level.
†Significant beyond 5% level.


It may be assumed that measures of
a speaker’s ability to develop oral pres-
sure can be used as an index of the
adequacy of velopharyngeal valving. In
the present study vital capacity meas-
ures both with the nostrils open and 
with the nostrils occluded were used 
for this purpose. A ratio was computed 
between the two measures for each 
subject. If the two measures were equal, 
a ratio of one was assigned. If the 
measurement with the nostrils occluded 
exceeded that with the nostrils open,
a ratio of less than one was assigned. 
Correlation coefficients were computed 
between the vital capacity ratios and 
the median scale values of severity of 
nasality for each isolated vowel and 
also for connected speech samples. The 
correlations are shown in Table 4. The 
obtained correlation coefficients com- 
puted for the connected speech samples 
and for each of the high vowels are 
statistically significant. As has already 
been pointed out, the relative degree 
of nasality in the high vowels of cleft 
palate speakers would be expected to 
be influenced most directly by the ade- 
quacy of velopharyngeal closure. The 
importance of the relationship between 
the adequacy of closure and the degree 
of nasality is emphasized by the fact 
that significant correlations were ob- 
tained despite the crudeness of the vital 
capacity measures, the multiple factors 
involved in producing nasal speech, in- 
dividual variability and the vagueness 
of the nasality construct. 
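
The ratio and the correlation described in this paragraph can be sketched in Python as follows. The ratio is taken as the nostrils-open measure divided by the nostrils-occluded measure, as stated in the caption of Table 4. The function names and all numbers in the example are hypothetical; the paper does not report individual vital capacity readings.

```python
from math import sqrt


def vital_capacity_ratio(open_nostrils, occluded_nostrils):
    """Vital capacity with nostrils open divided by that with nostrils occluded.

    A value below 1 means more air was measured with the nostrils occluded,
    which is taken here as a sign of leakage through the velopharyngeal port.
    """
    return open_nostrils / occluded_nostrils


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical readings (arbitrary units) and nasality scale values for five
# speakers; these are illustrative numbers, NOT data from the study.
open_readings = [2.4, 2.9, 3.0, 2.1, 2.7]
occluded_readings = [3.0, 3.0, 3.0, 3.0, 3.0]
nasality_scale_values = [5.5, 3.0, 2.5, 6.0, 4.5]

ratios = [vital_capacity_ratio(o, c)
          for o, c in zip(open_readings, occluded_readings)]
r_ratio_nasality = pearson_r(ratios, nasality_scale_values)
```

A negative coefficient, as in Table 4, means that speakers with lower ratios tended to receive higher nasality scale values.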


Summary 


The purpose was to study nasality 
of a group of cleft palate speakers. Re- 
cordings were made of seven isolated 
vowels and of connected speech pro- 
duced by 50 children with cleft palates. 
These recordings were scaled for se- 
verity of nasality by 30 judges. The 
connected speech samples were pre-
sented backward to minimize contami-
nation of the judgments by irrelevant 
factors such as misarticulations. 


According to obtained results, se- 
verity of nasality in connected speech 
is related to severity of nasality for 
each isolated vowel studied. The cor- 
relation coefficients ranged from .47 to 
.60. For cleft palate speakers high
vowels are, in general, more nasal than 
low vowels. For vowels with fairly 
comparable tongue height, front vowels 
are more nasal than back vowels. 
Trends for some individuals deviated 
markedly from those for the group.



Pitch And Duration Characteristics
Of Older Males 


EDWARD D. MYSAK 


A noteworthy series of investiga- 
tions dating back to 1939 has provided 
data on various male vocal attributes 
observed at different developmental 
stages. The pitch attribute in particu- 
lar has been analyzed for the infant 
(6), the child (8), the boy in various 
stages of adolescence (4) and the col- 
lege-age man (10, 17, 21). Data for the 
time-attribute also have been collected 
for several of these age groups by the 
same investigators. 

The analysis by developmental stages, 
however, has not at present progressed 
beyond the young adult; and the up- 
ward revision of mortality tables, re- 
flecting expanding middle-age and 
advanced-age populations, renders these 
later stages significant areas for vocal 
investigation. Hence, the decision was 
made to study two groups of elder 
males, as well as the middle-aged sons 
of many of the senior subjects. This 
design, in addition to its tentative nor- 
mative aspects, would make testing pos- 
sible for (a) differences between elder 
groups separated on an arbitrarily se- 
lected age criterion and (b) family re- 
lationships in measurable speech at-
tributes for elder fathers and their mid-
dle-aged sons. 

The variables under investigation for 
the three groups in both oral-reading 
and impromptu-speaking performances 
were: (a) mean and median fundamen- 
tal vocal frequency (pitch) level; (b) 
total range (distance in semitones be- 
tween the lowest and highest pitches 
used); (c) functional range (the range 
in semitones between the 5th and
95th percentiles of the distribution of 
vocal frequencies used); (d) standard 
deviation of the individual’s distribu- 
tion of vocal frequencies; (e) words
per minute; and (f) phonation/time 
ratio. 
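
The pitch variables defined above can be illustrated with a short Python sketch. It assumes that the analysis yields a list of fundamental frequency readings in cps and that semitones are counted from the 16.35-cps zero frequency level used later in the paper. The function name, the choice of percentile interpolation (`method="inclusive"`), and the sample readings are assumptions made for the example, not the procedure of the Fundamental Frequency Recorder.

```python
from math import log2
from statistics import fmean, median, quantiles, stdev

ZERO_FREQUENCY_LEVEL_CPS = 16.35


def pitch_statistics(frequencies_cps):
    """Summarize one speaker's distribution of vocal frequencies."""
    semitones = sorted(12 * log2(f / ZERO_FREQUENCY_LEVEL_CPS)
                       for f in frequencies_cps)
    # 19 cut points dividing the distribution into 20 equal parts:
    # the first is the 5th percentile, the last is the 95th percentile.
    cuts = quantiles(semitones, n=20, method="inclusive")
    return {
        "mean_pitch": fmean(semitones),
        "median_pitch": median(semitones),
        "total_range": semitones[-1] - semitones[0],
        "functional_range": cuts[-1] - cuts[0],
        "standard_deviation": stdev(semitones),
    }


# Hypothetical readings: thirteen tones one semitone apart, starting at 100 cps.
sample_frequencies = [100 * 2 ** (k / 12) for k in range(13)]
sample_stats = pitch_statistics(sample_frequencies)
```

Because the functional range discards the lowest and highest 5% of the distribution, it is always somewhat smaller than the total range and is less affected by a few extreme readings.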

The major questions to be answered
were: (a) What are the central tend-
ency data for the seven criterion vari- 
ables under the two speaking condi- 
tions for the three groups of male sub- 
jects? (b) Are there any statistically 
significant differences between the 
means of the two elder groups for any 
of the seven criterion variables? (c) 
Are there any family relationships 
manifested when specific voice varia- 
bles of a group of elder fathers are com-
pared with specific voice variables of 
their respective middle-aged sons? 


Procedure 


Subjects. The subjects were from 
four institutions for the aged (mean
length of enrollment, 6.1 years) and
also from private homes in Lafayette, 
Indiana; the middle-aged sample repre- 
sented the sons of various fathers with- 
in the elder groups. To serve as a sub- 
ject each individual was required to 
qualify as a reader and to be free of 
any serious physical, auditory or 
speech incapacity. The physical and 
auditory capacities of the elder individ- 
uals were required to be within normal 
limits for their age groups as described 
by various investigators (9, 13, 15, 22). 
This selection process resulted in a 
sample of 12 individuals in the age 
group from 65 to 79 years, with a mean 
age of 73.3 years (henceforth identi- 
fied as elder group I); 12 individuals 
in the age group from 80 to 92 years, 
with a mean age of 85.0 years (hence- 
forth identified as elder group II); and 
15 sons ranging in age from 32 to 62 
years, with a mean age of 47.9 years. 
Individual subjects were natives of 
at least nine Indiana cities, ranged in 
education from a few years in elemen-
tary school to a B.S. degree in engi-
neering, and represented at least 23 
types of semi-skilled, skilled and pro- 
fessional vocations. 


Speech Samples and Instrumentation. 
Each subject was required to record 
the 98-word first paragraph of the 
Rainbow Passage (7), which he had 
previously practiced, as well as a sample 
of impromptu speech. The topic, which 
was identical for all subjects, was, 
‘What I Like To Do Most In The 
Summertime.’ The recordings were
made on an Ampex, Model 400, tape 
recorder using an Altec Lansing M11 
microphone system with an Altec 21B 
miniature condenser microphone. The 
instrument used for the fundamental 
frequency analysis and phonation/time 
ratio was a modification of the Funda- 
mental Frequency Recorder con-
structed by Dempsey (5) at Purdue 
University; the innovation was devel- 
oped by Cummins (3) and was called 
a Comparator-Counter Attachment for 
the Fundamental Frequency Recorder. 


Table 1. Comparison of the pitch and duration data for the three groups during oral reading and
impromptu speaking.

                              Middle-Aged Males          Elder Group I              Elder Group II
Type of Speech                Oral         Impromptu     Oral         Impromptu     Oral         Impromptu
                              Reading      Speaking      Reading      Speaking      Reading      Speaking
Mean Fundamental Pitch
  (semitones above OFL)       33.4         33.0          34.9         34.4          37.2         36.7
Mean Fundamental
  Frequency (cps)             113.2        110.7         124.3        120.2         141.0        136.5
Median Fundamental Pitch
  (semitones above OFL)       32.9         32.6          35.0         34.3          37.4         36.7
Median Fundamental
  Frequency (cps)             110.3        107.9         124.9        120.1         142.6        137.1
Total Range (semitones)       16.9         16.6          [illegible]  17.0          19.6         19.4
Functional Range
  (semitones)                 9.5          9.4           9.9          9.6           11.3         11.4
Standard Deviation
  (semitones)                 2.9          2.0           3.0          2.8           3.3          3.4
Phonation/Time Ratio          .56          .46           .52          [illegible]   .48          .48
Words per Minute              122.4


The words-per-minute analysis re-
quired only a stop watch for instru- 
mentation. 


Reliability. Reliability for the Count- 
er Attachment and the recording and 
analysis procedures was high; the mean 
error factor in all cases was between 1
and 2%.


Results 


Normative Data. The results of the 
investigation with reference to average 
data for the three groups under the two 
speaking conditions are summarized in
Table 1.


Inspection of the results reveals a 
rise in median fundamental vocal fre- 
quency level as a function of age: from 
the middle-aged males’ fundamental 
frequency of 110.3 cps or 32.9 semi- 
tones above the zero frequency level 
(OFL) of 16.35 cps, to elder group I’s 
frequency of 124.9 cps or 35.0 semi- 
tones above OFL, to elder group II's 
level of 142.6 cps or 37.4 semitones 
above OFL. It should be noted that 
oral reading was seen to be character- 
ized by a slightly but consistently 
higher level than impromptu speaking. 
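
The relation between the cps and semitone figures quoted above can be approximated with the usual equal-tempered conversion, twelve semitones per doubling of frequency counted from the 16.35-cps zero frequency level. The Python sketch below is an illustration under that assumption; the function name is hypothetical, and the results agree with the tabled medians only to within about 0.2 semitone, presumably because the medians were computed on the semitone and frequency distributions separately or read from conversion tables.

```python
from math import log2

ZERO_FREQUENCY_LEVEL_CPS = 16.35


def semitones_above_ofl(frequency_cps):
    """Pitch level in semitones above the 16.35-cps zero frequency level."""
    return 12 * log2(frequency_cps / ZERO_FREQUENCY_LEVEL_CPS)


# Median fundamental frequencies for oral reading, from Table 1.
median_frequencies_cps = {
    "middle-aged": 110.3,
    "elder group I": 124.9,
    "elder group II": 142.6,
}

converted = {group: semitones_above_ofl(f)
             for group, f in median_frequencies_cps.items()}
```

On this scale the difference between the middle-aged group and elder group II is roughly four and a half semitones.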

In general, measures indicative of
pitch flexibility (total range, functional
range, and the standard deviation of
the distribution of vocal frequencies)
clustered about data previously re-
ported for college males. Illustrative of
this comparability are the data for the
standard deviation measure: 2.9, 3.0 and
3.3 semitones for middle-aged, elder
group I and elder group II, respec-
tively, compared with the 3.2 semitone
finding of both Snidecor (21) and
Hanley (10). These variability meas-
ures for the three groups investigated
in the present study showed a tendency
toward greater variability as a function
of age. In addition, the results reported
in Table 1 also indicate a tendency
toward greater variability during oral
reading as contrasted with impromptu
speaking.


Table 2. Analyses of variance testing for differences between elder groups with reference to the seven
criterion vocal variables.

Source                                  df         ms         F
Mean Fundamental Pitch
  Between Subjects
    Group                                1       63.48      9.59*
    Subject in Group                    22        6.82
  Within Subjects
    Method                               1        3.97      5.20†
    Group x Method                       1         .00      <1
    Method x Subject in Group           17**       .76
  Total                                 42
Median Fundamental Pitch
  Between Subjects
    Group                                1       70.81      8.13*
    Subject in Group                    22        8.71
  Within Subjects
    Method                               1        5.27      5.65†
    Group x Method                       1         .02      <1
    Method x Subject in Group           17**       .93
  Total                                 42
Total Range
  Between Subjects
    Group                                1       55.69      [illegible]
    Subject in Group                    22        6.07
  Within Subjects
    Method                               1        2.76      2.68
    Group x Method                       1         .73      <1
    Method x Subject in Group           17**      1.03
  Total                                 42
Functional Range
  Between Subjects
    Group                                1       30.10      3.04
    Subject in Group                    22        9.89
  Within Subjects
    Method                               1         .04      <1
    Group x Method                       1         .49      <1
    Method x Subject in Group           17**      2.66
  Total                                 42
Standard Deviation of Vocal Frequencies Used
  Between Subjects
    Group                                1        2.56      3.10
    Subject in Group                    22         .83
  Within Subjects
    Method                               1         .02      <1
    Group x Method                       1         .14      <1
    Method x Subject in Group           17**       .25
  Total                                 42
Phonation/Time Ratio
  Between Subjects
    Group                                1         .00      <1
    Subject in Group                    22         .01
  Within Subjects
    Method                               1         .02      3.15
    Group x Method                       1         .02      3.38
    Method x Subject in Group           17**       .01
  Total                                 42
Words per Minute
  Between Subjects
    Group                                1     1559.52      1.55
    Subject in Group                    22     1007.74
  Within Subjects
    Method                               1       32.34      <1
    Group x Method                       1      112.24      <1
    Method x Subject in Group           17**   1305.23
  Total                                 42

* Significant at the .01 level.
† Significant at the .05 level.
** df reduced from 22 to 17 because of five items of missing data (one set of oral-reading and two
sets of impromptu-speaking data for elder group II and one set of each kind for elder group I).
Group means were used as estimates of missing values.


As for measures of time, the follow- 
ing were observed in the data: (a) 
Phonation/time ratio showed a pro- 
gressive reduction in magnitude as a 
function of age in the three groups in- 
vestigated; moreover, generally smaller 
ratios as compared to college males 
were observed. The range for groups 
in the present study was .56 to .48 (in 
order of increasing age), contrasted to 
the .68 and .65 reported by Snidecor 
and Hanley, respectively. (b) Words 
per minute manifested a similar pro- 
gressive reduction in magnitude as a 
function of age in the three groups, as 
well as a similar general lower rate for 
the elder groups as compared to college 
males. Results given in Table 1 also 
indicate that the middle-aged group
was characterized by the usually reported
greater oral-reading rate (21)
as compared to the impromptu-speak- 
ing rate; however, in the elder groups, 
these measures were essentially the 
same. 


Examination for Differences Between 
Elder Groups. In Table 2 are presented 
results of the analysis of variance of a 
series of split-plot designs. Reported 
here are results of tests of significance 
made on the previously discussed dif- 
ferences between the two elder groups 
on the basis of the seven criterion vocal 
variables. 
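As a reading aid for Table 2, the following minimal sketch recomputes the F ratios for words per minute from the printed mean squares. It assumes the usual split-plot convention, in which the Group effect is tested against Subject in Group and the within-subjects effects are tested against Method x Subject in Group; the function name `f_ratio` and the constant names are illustrative and do not come from the study.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F ratio: an effect mean square divided by its error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Mean squares as printed in Table 2 for words per minute.
MS_GROUP = 1559.52              # between subjects: Group
MS_SUBJECT_IN_GROUP = 1007.74   # between-subjects error term
MS_METHOD = 32.34               # within subjects: Method
MS_GROUP_X_METHOD = 112.24      # within subjects: Group x Method
MS_METHOD_X_SUBJECT = 1305.23   # within-subjects error term

if __name__ == "__main__":
    # Group effect, tested against Subject in Group.
    print(round(f_ratio(MS_GROUP, MS_SUBJECT_IN_GROUP), 2))  # 1.55
    # Both within-subjects effects fall below 1, as the table reports.
    print(f_ratio(MS_METHOD, MS_METHOD_X_SUBJECT) < 1)
    print(f_ratio(MS_GROUP_X_METHOD, MS_METHOD_X_SUBJECT) < 1)
```

The recomputed Group ratio agrees with the tabled value of 1.55, which suggests the table entries are plain ratios of the adjacent mean squares.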

In measures of pitch central ten- 
dency, the two groups were found to 
be significantly different at the 1% 
level. This reflected the finding that 
elder group II individuals had fundamental
pitches which were significantly
higher than those of elder group I individuals.
To illustrate this general upward dis- 
placement of vocal frequency as a 
function of age, a graphic representa- 
tion of group distributions of vocal 
frequencies in the oral-reading per- 
formance is shown in Figure 1. Al- 
though not included in the analysis of 
variance, data from the middle-aged 
group also are represented in this figure 
for purposes of comparison. In addi- 
tion, the analyses for pitch central 
tendency revealed that there was a 
significant difference between the oral- 
reading and impromptu-speaking meas- 
ures at the 5% level. This result indi- 
cated that pitch central tendency in 
oral reading was significantly higher 
than in impromptu speaking. 


Figure 1. The distribution of vocal frequencies for the three groups during oral reading. (Legend: elder group I, elder group II, son group.)

The analyses for measures of pitch
variability indicated that elder group
II was characterized by a significantly
greater total range than elder group 
I (1% level), while no significant dif- 
ferences were found between the 
groups on measures of functional range 
and standard deviation of vocal fre- 
quencies used. 

No significant differences between 
groups were found in analyses of the 
time measures; however, trends toward 
reduction in these measures, as a func- 
tion of age, were observed as has been 
mentioned earlier. 


Familial Relationships. To test for 
family relationships between elder 
fathers and their respective middle- 
aged sons on some of the vocal vari- 
ables studied, a correlation analysis was 
employed. This involved obtaining the 
correlations between the performances 
of 15 sons and their respective 15 
fathers on five vocal variables measured 
during oral reading. The measures 
chosen for analysis were: median fun- 
damental pitch, functional range, stand- 
ard deviation of the distribution of 
vocal frequencies, phonation/time ratio 
and words per minute. 

The results of this series of cor- 
relations revealed that no significant 
degree of association existed between 
father and son on the particular vocal 
variables studied. The obtained coef- 
ficients ranged from .07 to .41. 
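The report does not state which coefficient or significance test was used. As a hedged illustration only, the sketch below assumes Pearson product-moment correlations on 15 father-son pairs and a two-tailed t test at the .05 level; the critical value 2.160 for 13 degrees of freedom is taken from a standard t table, and the function names are hypothetical.

```python
import math

# Two-tailed .05 critical t for 13 degrees of freedom (standard t table).
T_CRITICAL_DF13_05 = 2.160


def t_from_r(r: float, n: int) -> float:
    """Convert a correlation r based on n pairs to a t statistic with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit: float, n: int) -> float:
    """Smallest |r| that reaches the given critical t for n pairs."""
    df = n - 2
    return t_crit / math.sqrt(t_crit * t_crit + df)


def is_significant(r: float, n: int, t_crit: float = T_CRITICAL_DF13_05) -> bool:
    """True when |t| for this r exceeds the critical t (valid here for n = 15 only)."""
    return abs(t_from_r(r, n)) > t_crit


if __name__ == "__main__":
    n_pairs = 15
    for r in (0.07, 0.41):  # lowest and highest coefficients reported
        print(r, round(t_from_r(r, n_pairs), 2), is_significant(r, n_pairs))
    print(round(critical_r(T_CRITICAL_DF13_05, n_pairs), 3))
```

Under these assumptions a coefficient would need to exceed roughly .51 to reach the .05 level with 15 pairs, so the largest obtained value of .41 falls short, consistent with the conclusion stated above.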


Discussion 


Pitch central tendency measures for 
the three groups investigated revealed 
that oral-reading performances were 
slightly higher in pitch level than im- 
promptu-speaking performances. These 
measures manifested a progressive upward
trend as a function of age, which
may be attributed to certain changes 
in laryngeal physiology, due to the 
aging process, as well as to various 
tension-conditioning factors resulting 
from socio-emotional changes. In the 
case of the middle-aged males, whose 
vocal frequency level of approximately 
110 cps is lower than the level reported 
in studies of college males by Snidecor 
(21), 129 cps, and by Hanley (10), 
118.6 cps, it appears that tension and 
conditioning factors are exerting the 
greater influence. It may be conjectured
that the middle-aged individual,
who is generally married, settled and
has a definite vocation, might exhibit
less generalized tension than the young
adult who is still to realize most of his
desired goals; this reduction of tension
might be reflecting itself in a slightly 
lower pitch level. With respect to the 
elder groups, whose pitch levels mani- 
fested an upward trend, a combination 
of physiological and socio-emotional 
factors may contribute to the trend. 
Such factors as central nervous system 
atrophy, increased blood pressure,
changes in the respiratory system, 
various endocrinological and muscle 
changes (19, 11, 2, 20), are believed 
capable of displacing pitch. As for 
socio-emotional changes, such factors 
as decreasing self-sufficiency, diminish- 
ing personal faculties, forced retire- 
ment, loss of family and friends, and 
various psychical traumata (18, 23, 12) 
are capable of increasing tension and 
anxiety and, in turn, affecting pitch 
level. In addition, further explanations 
might be forthcoming upon the advent 
of investigations of gerontological 
processes operating with reference to 
various feedback mechanisms. In terms 
of the auditory channel, the substantial
upward trend in pitch level in elder
group II might be related to the char- 
acteristic audition changes in this 
group. 

Measures of pitch variability for the 
three groups investigated were almost 
identical for both oral reading and im- 
promptu speaking; however, when 
differences did exist (except for elder 
group II which reflected an opposite 
trend), the measures for oral reading 
were found to exceed slightly those 
for impromptu speaking; these meas- 
ures reflected a general trend toward 
greater variability as a function of age. 

This general tendency might be ex- 
plained on the basis of an experiment 
done by McIntosh (14) who found
that variability increased as pitch level 
was raised. Therefore, those factors 
responsible for the progressive rise in 
pitch level in the three male groups of 
the present study might also be indi- 
rectly responsible for the progressive 
increase in pitch variability. 

Time measures for the three groups 
investigated indicated that oral-reading 
rate was noticeably more rapid than 
impromptu-speaking rate in the son 
group, only slightly more rapid than 
impromptu-speaking rate in elder 
group I and slightly less rapid than im- 
promptu-speaking rate in elder group 
II, while phonation/time ratio for oral 
reading was found to be considerably 
larger than impromptu speaking for the 
son group and elder group I but iden- 
tical in both performances for elder 
group II; these measures evidenced a 
general progressive reduction in mag- 
nitude as a function of age. From a 
physiological frame of reference, this 
pattern of reduction in time measures 
can be interpreted as a function of general
slowing of neuro-muscular activity,
resulting in fewer words per
minute and the combination of shorter 
phonation periods with greater dura- 
tions of pause time. From a socio- 
emotional standpoint, these results 
might reflect the need for time simply 
to evoke words or to think of the most 
appropriate words in the impromptu- 
speaking performance, and the desire 
to be accurate in the requested oral- 
reading performance. Here again, geron- 
tological changes in auditory, visual, 
and proprioceptive feedback systems 
must be considered. 

When tests of significance are ap- 
plied to the data, the findings reveal 
that adult males aged from 80 to 92 
are characterized by significantly 
higher measures of average fundamental 
pitch level than adult males aged from 
65 to 79. Furthermore, the analyses 
revealed that measures of average fun- 
damental pitch level for oral reading 
are significantly higher than measures 
of average fundamental pitch level for 
impromptu speaking. However, no sig- 
nificant differences were found be- 
tween the two groups on time measures. 

On the whole, the results suggest 
that significant differences do exist be- 
tween the two elder groups and that 
the arbitrary dividing point at 80 years
proved to be a rather sensitive one. 
However, it should be remembered 
that this dividing point was arbitrarily 
chosen, and that future studies directly 
concerned with this particular aspect 
of the study might discover an age 
level which is even more discriminating. 

No significant family relationships 
were found, relative to five studied 
vocal attributes, between a group of 
elder fathers and their respective mid- 
dle-aged sons. 

As with Paul’s (16) investigation,
which showed only very specific kinds
of parent-child relationships in terms 
of the intensity characteristic of speech, 
the present investigation does not tend 
to support, experimentally, the obser- 
vations on family relationships in 
speech sometimes reported (1, 24, 25). 
That speech similarities exist between 
parents and children, with reference 
to such attributes as articulation, pitch 
patterning, and voice quality during 
particular developmental periods, may 
yet be demonstrated experimentally. In 
the present case, the combination of 
environmental expansion, with the con- 
comitant expansion of speech expe- 
riences, and individual physiological 
changes as a function of maturation, 
may have disrupted the perpetuation 
of particular speech similarities between 
certain family members. Moreover, the 
present results may also be reflective 
only of the particular samples used and 
the particular statistical technique em- 
ployed, with real familial speech pat- 
terns concealed by the methodology. 


Summary 


The present investigation involved 
the study of two elder male groups and 
a middle-aged group with respect to 
several vocal variables under oral-reading
and impromptu-speaking conditions.
The specific purposes were (a)
normative, to establish tentative norma- 
tive data for the older male groups; 
(b) discriminative, with reference to 
whether statistically significant differ- 
ences exist between two groups of 
differently aged elder individuals; and 
(c) comparative, to test for family 
relationships in speech between certain 
elder fathers and their middle-aged 
sons. 


Within the limitations of the study, 
the following statements may be made: 
(a) For the three groups investigated, 
measures of pitch central tendency re- 
vealed a progressive upward trend, 
measures of pitch variability a general 
trend toward greater flexibility and 
time measures a progressive reduction 
of magnitude, all as a function of age. 
(b) Adult males aged from 80 to 
92 are characterized by significantly 
higher measures of average fundamen- 
tal pitch level than adult males aged 
from 65 to 79; however, no significant 
differences were found between the 
groups on time measures. (c) No sig- 
nificant family relationships were found 
between a group of elder fathers and 
their respective middle-aged sons.


A Note On Optimal Vocal Frequency


ARTHUR S. HOUSE 


In a recent paper Thurman (4) re- 
ported an investigation of a clinical 
procedure for estimating a natural, or 
optimum, vocal frequency. The pro- 
cedure in question requires subjects to 
sing or hum the musical scale while an 
observer locates an involuntary in- 
crease in vocal output alleged to co- 
incide with the vocal frequency under 
search. Thurman’s subjects intoned the 
three vowels /e/, /a/ and /u/ and 
hummed in approximate musical steps 
over their vocal ranges. Their produc- 
tions were recorded and these data 
subsequently were processed to yield 
measures of fundamental frequency and 
relative level. This experimental ex- 
amination failed to support the clinical 
technique in question. 

Since the validation of clinical con- 
cepts and tools is of interest to many, 
it seems appropriate to offer some fur- 
ther comment on the problem. The 
comments that follow are in support, 
essentially, of Thurman’s conclusions. 
They are meant to point out, however, 
that in addition to his experimental 
evidence, a consideration of the mechanism
of speech production suggests
strongly that the clinical procedures
in question are predestined not to pro- 
duce an indication of optimum level, 
at least in those cases where vowel 
sounds are produced by the subject. 

When a search for an optimum pitch 
level is made by intoning vowel sounds 
at discrete fundamental frequencies 
throughout a subject’s vocal range 
there probably is a tacit assumption 
(a) that some sort of optimal physical 
relationship among the various com- 
ponents of the vocal mechanism will 
be found, (b) that when this relation- 
ship obtains, the laryngeal output will 
be maximal and will be reflected in a 
maximum in vocal output at the lips 
and (c) that the change in over-all 
level will be perceptible. It might be 
assumed, furthermore, that vocal effort 
is constant during the procedures. Par- 
enthetically, it can be pointed out that 
no good measure of effort is available; 
this problem is discussed briefly below.

The description of the clinical pro- 
cedure leads one to suspect that its 
proponents intuitively are seeking a 
condition of optimal ‘coupling’ (in 
some undefined sense) between the ac- 
tion of the larynx and the supraglottal 
system. It has been shown, however, 
that the volume velocity at the glottis 
is relatively unaffected by the supra- 
glottal configurations (5). A maximum 
in laryngeal output, therefore, must
originate from activity within the
larynx itself and not from some so-called
coupling effects. This note attempts
to show that maxima in vocal
output can be expected when the laryngeal
output is constant. Furthermore,
it points out that without elaborate
controls it would be difficult to
detect maxima in laryngeal activity
from observations on the vocal output.

Figure 1. Idealized pass-band filter function. Harmonic lines appropriate for two fundamental frequencies are indicated, as follows: solid lines, F0=200 cps; broken lines, F0=275 cps.


The relative independence of the 
glottal and supraglottal effects makes 
possible the mathematical description 
of speech sounds by specifying the 
source function and the filter function 
which together determine the system 
function (1, 2). Such a specification indicates
that the glottal output serves to
excite a resonating system whose char- 
acteristics help determine the various 
spectra associated with vowels. The 
harmonic spectrum of vowels is de- 
rived from the glottal impulses and is 
well known, as is the location of the 
spectral maxima or formants of vowel 
sounds. Probably less well understood 
is the fact that the spacing of the harmonics
of a vowel sound is independent
of the center frequencies of the resonances
of the vocal tract.

The effect of this independence is 
exemplified in Figure 1 which shows 
an idealized filter of known half-power 
band width (200 cps) excited by a sig- 
nal with a harmonic spectrum consist- 
ing of a fundamental and overtones. 
When the fundamental frequency is 
200 cps, the harmonics are at 400 cps, 
600 cps, ... (the solid spectral lines),
and the relative over-all level of this
signal is approximately 21.5 db¹. That
is, the over-all level is almost identical 
with the level of the harmonic situated 
at the center frequency of the filter. 
When the fundamental frequency is 
raised to 275 cps, the harmonics are at 
550 cps, 825 cps, ... (the dashed spec- 
tral lines), and the relative over-all level 
of the signal is reduced to approximately
18.5 db, a level about |}3 db
greater than the second harmonic. The 
mechanism is analogous to the excita- 
tion of the vocal tract by the glottal 
source, except that since in natural 
vowels the band width of the first 
formant is considerably less than 200 
cps, the variation in level could be 
greater. 


To obtain an estimate of the degree 
of variation in over-all level that can 
be expected in vowels, idealized first 
formant functions appropriate to /u/, 
/e/, /a/ were constructed. These func- 
tions were derived primarily from 
studies of vowel synthesis using resonant
circuits and agree well with theoretical
expectations. In the analyses
that follow only the first formant is
considered, since the higher formants
are far enough below the level of the
first formant to contribute very little
to the over-all level of the vowel². To
simplify the graphic constructions, the
center frequencies of the first formants
of the vowels were chosen to be
300 cps, 500 cps and 700 cps for /u/,
/e/ and /a/, respectively, and the half-power
band widths of the formants
were made to be approximately 100
cps. Since these formant values resemble
closely those usually attributed
to male talkers, 100 cps was taken as
the initial fundamental frequency and
the harmonics of this fundamental
were constructed. The fundamental
frequency was then varied systematically
and similar graphic constructions
were made and the relative over-all
levels under the first formant curves
were computed for the various line
spectra. These procedures were followed
for the three ‘vowels.’

¹In this discussion, over-all levels are computed by adding the squares of the amplitudes of the harmonics and expressing the results in decibels.

The results of the measurements are 
shown in graphic form in Figure 2. 
In the figure the abscissa shows the 
fundamental frequency of the glottal 
source excitation and the ordinate in- 
dicates the over-all level of the vocal 
output. Each curve shows as a function 
of frequency the levels associated with 
a specific first formant whose center 
frequency is constant. The curves were 
normalized by setting equal the am- 
plitudes of the formants at center fre- 
quency. 

²When the spectral components of an idealized vowel /u/ with F1=300 cps and F0=100 cps are examined, it can be shown that the partials higher than the lowest five raise the over-all level considerably less than 0.5 db.

Figure 2. Relative over-all levels as a function of frequency for three idealized vowels. Estimates restricted to energy close to first formant (F1). Center frequencies of F1 at 300 cps, 500 cps and 700 cps. All formant band widths are approximately 100 cps. Over-all levels normalized by equating amplitudes of first formants. An assumption of constant dc volume velocity at the glottis is made.

The data indicate that extensive,
and presumably perceptible, changes in
over-all level will occur when the articulatory
configuration of the vocal
tract remains fixed and the fundamental
frequency of the voice is changed.
The variations, furthermore, are not of
the kind sought by the clinician endeavoring
to identify optimum level,
since a maximum is associated with
each subharmonic of the center frequency
of the first formant.
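The recurrence of maxima at subharmonics of the first-formant center frequency can be illustrated numerically. The sketch below is not the graphic construction used in this note: it assumes a simple single-tuned resonance curve (gain of 1 at the center frequency, half power at the band edges), equal-amplitude source harmonics, and an arbitrary 4000-cps upper limit on the harmonics summed. Over-all level is computed as described in the text, by adding the squared harmonic amplitudes and expressing the sum in decibels. All function names are hypothetical.

```python
import math


def resonance_power(freq: float, center: float, bandwidth: float) -> float:
    """Squared gain of an assumed single-tuned resonance.

    Equals 1.0 at the center frequency and 0.5 at center +/- bandwidth / 2.
    """
    detune = (freq - center) / (bandwidth / 2.0)
    return 1.0 / (1.0 + detune * detune)


def overall_level_db(f0: float, center: float,
                     bandwidth: float = 100.0, max_freq: float = 4000.0) -> float:
    """Over-all level in db for equal-amplitude harmonics of f0 passed through the resonance."""
    total = 0.0
    k = 1
    while k * f0 <= max_freq:
        total += resonance_power(k * f0, center, bandwidth)
        k += 1
    return 10.0 * math.log10(total)


if __name__ == "__main__":
    # 100 and 125 cps are subharmonics of 500 cps; 111 and 143 cps are not.
    for f0 in (100.0, 111.0, 125.0, 143.0):
        print(f0, round(overall_level_db(f0, center=500.0), 1))
```

With a 500-cps formant, fundamentals of 100 and 125 cps place a harmonic exactly on the formant peak and yield higher levels than the neighboring fundamentals of 111 and 143 cps, whose harmonics straddle the peak.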

If natural speech is to be discussed, 
it must be pointed out that the curves 
in Figure 2 embody an assumption 
about the operation of the glottis, that 
is, that there is a constant dc volume 
velocity through the glottis. It is in- 
teresting to speculate that this physical 
condition correlates positively with 
‘constant effort’ on the part of the 
talker. 

Using these assumptions it can be 
said, for example, that if the glottis 
generates an impulse of amplitude x 
every 10 msec (that is, produces a 100- 
cps signal at a known level), then rais- 
ing the vocal frequency by one octave, 
while maintaining constant effort, means 
producing twice as many impulses per 
unit time, each impulse having an 
amplitude of x/2. On the other hand,
it can be assumed that constant effort
is a condition of equal amplitude of
glottal impulses. If this is the case
there would be a gain proportional to
frequency in the level of the harmonics
of the output. The major consideration
here, however, is whether
the latter assumption will change the
range of variation in over-all levels predicted
by the curves in Figure 2. The
curves in Figure 3 demonstrate the
effect of changing the assumption and
indicate that the question can be answered
in the negative.

Figure 3. Relative over-all levels as a function of frequency for a vowel with F1 at 500 cps. Upper curve assumes constant pulse amplitude at glottis; lower curve assumes constant dc volume velocity at glottis.

Each curve in Figure 3 describes the 
variations in over-all level when the 
center frequency of the first formant 
is at 500 cps. The upper curve as- 
sumes equal impulse amplitudes (and 
changes in the amplitudes of harmon- 
ics); the lower curve assumes constant 
dc volume velocity at the glottis or 
changes in impulse amplitudes (and 
equal harmonic amplitudes). 
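The relation between the two assumptions can be sketched as follows. For a periodic train of impulses, the amplitude of every harmonic is proportional to the impulse amplitude multiplied by the repetition rate. Holding the dc volume velocity constant therefore gives equal harmonic amplitudes at all fundamentals, while holding the impulse amplitude constant raises every harmonic in proportion to the fundamental frequency (about 6 db per octave). The resonance shape, the 100-cps reference fundamental, the 4000-cps summation limit, and all function names below are assumptions made for illustration.

```python
import math


def resonance_power(freq: float, center: float, bandwidth: float) -> float:
    """Squared gain of an assumed single-tuned resonance (1.0 at center)."""
    detune = (freq - center) / (bandwidth / 2.0)
    return 1.0 / (1.0 + detune * detune)


def filtered_power(f0: float, center: float, bandwidth: float, max_freq: float) -> float:
    """Sum of squared harmonic amplitudes after the resonance, unit-amplitude harmonics."""
    total = 0.0
    k = 1
    while k * f0 <= max_freq:
        total += resonance_power(k * f0, center, bandwidth)
        k += 1
    return total


def level_constant_dc_db(f0: float, center: float = 500.0,
                         bandwidth: float = 100.0, max_freq: float = 4000.0) -> float:
    """Lower-curve assumption: constant dc volume velocity, equal harmonic amplitudes."""
    return 10.0 * math.log10(filtered_power(f0, center, bandwidth, max_freq))


def level_equal_impulse_db(f0: float, center: float = 500.0, bandwidth: float = 100.0,
                           max_freq: float = 4000.0, ref_f0: float = 100.0) -> float:
    """Upper-curve assumption: equal impulse amplitude, harmonics scaled by f0 / ref_f0."""
    tilt_db = 20.0 * math.log10(f0 / ref_f0)
    return level_constant_dc_db(f0, center, bandwidth, max_freq) + tilt_db


if __name__ == "__main__":
    for f0 in (100.0, 125.0, 143.0, 166.7, 200.0, 250.0):
        print(f0,
              round(level_constant_dc_db(f0), 1),
              round(level_equal_impulse_db(f0), 1))
```

In this sketch the two curves coincide at the reference fundamental and separate by a smooth amount that grows with frequency, so the ripple produced by the formant remains superimposed on either curve.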

It may be of interest to examine this 
situation in the time domain. Figure 4
shows trains of exponentially damped
wave forms similar to the supraglottal
(vocal) output. In the upper part of
the figure the wave forms occur every
10 msec, that is, the fundamental frequency
is 100 cps, and the period of
the first resonance is 2 msec, that is,
the frequency is 500 cps. After the
first period of the fundamental the
wave forms are in phase and will summate
to produce maxima.

Figure 4. Damped exponential wave forms showing phase effects as fundamental frequency is changed.

Figure 5. Relative over-all levels as a function of frequency for three idealized vowels with first formants appropriate to female talkers. Center frequencies of F1 at 320 cps, 620 cps and 840 cps. All formant band widths are approximately 100 cps. Over-all levels normalized by equating amplitudes of first formants. An assumption of constant dc volume velocity at the glottis is made.


In the lower part of the figure 
the fundamental frequency has been 
doubled, that is, the wave forms occur 
every 5 msec, but the first resonant 
frequency remains at 500 cps. In this 
case the subsequent wave forms do 
not summate completely but will tend 
to reduce the level of the signal. 
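The phase argument can be checked with a small calculation. If each glottal impulse excites a damped oscillation, the steady-state wave form is the sum of the current response and the decayed tails of all earlier ones, which forms a geometric series. The sketch below assumes a 100-cps band width for the 500-cps resonance (the text does not give the damping used in Figure 4) and equal impulse amplitudes; `reinforcement_factor` is a hypothetical name.

```python
import cmath
import math


def reinforcement_factor(f0: float, resonance: float = 500.0,
                         bandwidth: float = 100.0) -> float:
    """Steady-state envelope of the overlapped responses relative to one isolated response.

    Each response is modeled as exp((-decay + j*2*pi*resonance) * t). Summing the
    tails of all earlier responses gives the geometric series 1 / (1 - z), where z
    is the complex factor by which one response has changed after one period.
    """
    period = 1.0 / f0
    decay = math.pi * bandwidth  # envelope decay rate for this half-power band width
    z = cmath.exp(complex(-decay, 2.0 * math.pi * resonance) * period)
    return 1.0 / abs(1.0 - z)


if __name__ == "__main__":
    for f0 in (100.0, 200.0):
        print(f0, round(reinforcement_factor(f0), 3))
```

Under these assumptions the factor is slightly above 1 at 100 cps, where successive wave forms arrive in phase, and below 1 at 200 cps, where each new wave form arrives half a resonance period out of step with the tail of the preceding one.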

In the recent experimental attempt 
to validate the pitch-finding technique, 
the probability of failure is raised by 
the presence of both males and females 
among the experimental subjects. As is 
well known, on the average the vocal-tract
dimensions of females will produce
formants with center frequencies
displaced upwards from corresponding
male values (3). This systematic difference
plus the wider spacing of harmonics
in the vocal range of female
voices will introduce variation into the 
experimental results. In Figure 5, for 
example, are the relative over-all levels 
of ‘vowels’ /u/, /e/ and /a/ with cen- 
ter frequencies of their first formants 
more appropriate to female talkers and 
with fundamental vocal frequency 
varying over an appropriate range. 
Comparison of these curves with those 
in Figure 2 shows the maxima and 
minima to be displaced but to recur 
systematically. 

It can be asserted, of course, that 
the analogy to natural speech processes 
is not perfect. If the so-called vocal 
swells are sought by modulating the 
fundamental vocal frequency continu- 
ously in time, the variation in over-all 
level probably will not be as great as 
those delineated above. Similarly, dur- 
ing natural speech the center frequen- 
cies of the formants would not remain 
as stationary as in the model—the ar- 
ticulatory configuration is not easy to 
maintain and the talker might habit- 
ually ‘tune’ the resonators by means of 
auditory cues—and this presumably 
could reduce the range of level varia- 
tion. In addition, neither assumption 
made here about vocal effort might de- 
scribe this phenomenon adequately. It 
is felt strongly, however, that these 
limitations would not reduce the range 
of variation sufficiently to obliterate 
perceptible changes in over-all level 
that are not attributable to optimal la- 
ryngeal operation. 

In conclusion, the theoretical con- 
siderations outlined above could ac- 
count for most of the seemingly un- 
lawful variations in the data reported 
by Thurman. They imply, furthermore,
that tests similar to Thurman’s
will yield indecisive results unless elab- 
orate experimental controls are added. 
It would seem, however, that at the 
present time the cost of realizing such 
controls is so great as to suggest that 
the most profitable approach to this and 
similar problems is through a considera- 
tion of theoretical models or experi- 
ments utilizing synthetic speech. 


Summary 


The article describes a physical characteristic
of vowel production sufficient
to account for systematic and presumably
perceptible variations in over-all
vowel level as a function of vocal
frequency. The discussion supports the 
conclusion that certain traditional
methods advocated for locating optimum
pitch levels are not adequate.


Relationships Among


Three Filmed Lip-Reading Tests 


JOHN J. O'NEILL 


MARY C. STEPHENS 


One of the major problems encountered 
in any study of lip-reading skill is the 
determination of the degree of lip-read- 
ing skill possessed by experimental sub- 
jects. This problem involves not only 
the selection of suitable materials for 
such a test but also the establishment of 
the reliability and validity of the test. 
Experimenters have employed several 
approaches in their construction of lip- 
reading tests. Some employ only single 
words, some employ sentences, while 
others have developed Coe skits. 
Still unanswered is the question of
which sort of recall, that is, recall of 
individual elements (words) or recall 
of thought units, is most representative 
of lip-reading skill. Most of the cur- 
rent lip-reading tests employ the indi- 
vidual unit method of scoring. Lashley 
(3, pp. 112-146), however, in speaking 
of language behavior, disputes the ‘as- 
sociative chain’ theory of language 
recognition. He states that in speech 
the movement elements (sounds or 
words) occur in so many permutations 
that individual associations between
pairs of them are improbable. Temporal
order is imposed upon the elements of
speech by some preceding organization.
Temporal order, in the instance of 
language, is imposed by the set or idea 
to be expressed. 

The problem of test reliability is the 
easiest to investigate. But the problem 
of determining the validity of lip-read- 
ing tests is difficult. Because of the 
aforementioned difficulty of describing 
what constitutes skill in lip reading, the 
testing instrument itself becomes the 
only method of ascertaining the lip- 
reading skill of an individual. Thus, 
until more definitive criteria for the 
description of lip-reading skill are avail- 
able, determination of test validity will 
remain a problem. Greene (2, p. 57) 
suggests five possible ways to establish 
the validity of a test. Two of these 
are (a) correlation with other similar 
measures and (b) correlation with the 
judgment of authorities in the area. 

At the present time the majority of 
lip-reading tests are filmed motion-pic- 
ture tests. Use of this technique allows 
for control over the constancy of 
speech movements and the visibility of 
speech presentation. The two best 
known tests are those prepared by Ut- 
ley (6) and Mason (4). Another film 
series, Life Situation Films, produced 
by Morkovin (5), while a training 
series, can be adapted to a test situation. 
To the present time no attempt has been
made to investigate whether the various
tests measure the same abilities. 

If it were possible to demonstrate 
high correlations among tests based 
on somewhat different approaches to 
the measurement of lip-reading skill, 
greater faith in their common validity 
might be warranted. Such correlations 
would mean that the ‘common sense’ 
judgment of considerable intrinsic va- 
lidity in each test was not refuted by 
statistical evidence indicating each was 
measuring largely unique skills. 

The purpose of this study was to in- 
vestigate the adequacy of several filmed 
tests of lip-reading ability. Major at- 
tention was devoted to the determina- 
tion of how a sample of hard-of-hear- 
ing subjects performed on three tests 
of lip-reading skill. This type of ex- 
perimental sample differs from that 
employed in previous studies, which 
used normal-hearing or deaf subjects. 


Procedure 


The subjects were 26 hard-of-hearing 
adults. All were enrolled in regularly 
scheduled lip-reading classes at the 
Columbus (Ohio) Hearing Society. 
The age range of the subjects was 
from 20 years to 60 years, with a mean 
age of 45.4 years. The length of enroll- 
ment in lip-reading instruction ranged 
from 12 weeks to 25 years. Three si- 
lent motion-picture films designed for 
the testing of, or training of, lip- 
reading ability were chosen as the 
stimulus materials. In order to com- 
pensate for possible learning factors or 
fatigue on the part of the subjects, the 
films were shown in six counterbal- 
anced orders. There were six group 
showings with similar experimental 
conditions for all the groups. Each 
group viewed all three films in one
evening. The entire experimental session
required approximately 1.5 hours.
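Three films can be arranged in exactly six different orders, which matches the six counterbalanced orders and six group showings described above. The sketch below enumerates those orders; the assumption that each showing received a distinct order is an inference, since the report does not say how orders were assigned to groups, and the function name is hypothetical.

```python
from itertools import permutations

# Short labels for the three films used in the study.
FILMS = ("Mason", "Utley", "Morkovin")


def counterbalanced_orders(films):
    """Return every possible presentation order of the given films."""
    return list(permutations(films))


if __name__ == "__main__":
    for showing, order in enumerate(counterbalanced_orders(FILMS), start=1):
        print(showing, " -> ".join(order))
```

When all six orders are used, each film is shown first, second and third equally often, which is the property that offsets practice and fatigue effects.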


Tests. The Mason Film 30 is the last 
in a series of training films designed to 
aid in lip-reading instruction for adults. 
The film was specifically designed by 
Mason (4) at The Ohio State Univer- 
sity as a test of lip-reading skill. It 
contains 20 complete sentences, total- 
ing 330 words; the mean sentence 
length is 16.5 words. This film is a 16 
mm color film with a running time of 
seven minutes, not including inter- 
ruptions for transcription. The film 
shows the head and shoulders of a col- 
lege male who speaks the sentences. 

The Utley Lip-reading Test Film 
was constructed by Utley (6) at 
Northwestern University and is en- 
titled ‘How Well Can You Read Lips?’ 
The film is divided into three sections: 
Part I, sentences and short phrases;
Part II, words; and Part III, stories. 
The film is available in two forms, A 
and B; Form A was employed in the
present study. Part I contains 31 short
phrases and sentences designed to be
representative of common, frequently- 
used word combinations. The total 
number of words in this section is 124; 
the mean sentence length is 4.0 words. 
The sentences are spoken by an adult 
female who holds up an appropriately 
numbered card before speaking each 
sentence or phrase. The speaker is 
photographed from the waist up, facing 
the camera; occasional head movements 
are made during the presentation of 
the material. The film is photographed 
in black-and-white on 16 mm film with 
15-second strips of unexposed film be- 
tween adjacent presentations of the 
sentence materials. Total running time 
is 12 minutes. 

Part II, which contains 36 isolated 
words spoken by the same female, was
not utilized in the present study. Part 
III contains six skits or stories, the first 
five skits involving two persons, and 
the sixth, one person speaking into a 
telephone. This section of the film was 
photographed in color. Only the first 
three stories in this section of the film 
were used in the present study. These 
three episodes are enacted by an adult 
female in a mother role and an elemen- 
tary school boy in a son role. Five 
questions pertaining to each story fol- 
low each episode. The running time of 
this section of the film is eight minutes. 


The Morkovin Life Situation Film 
Number 101 is the first in a series of 
10 training films depicting a life situa- 
tion and is entitled “The Family Din- 
ner.” The Life Situation Series was 
constructed by Morkovin and Moore 
(5) at the University of Southern Cali- 
fornia. The film has four speaking 
characters: an adult male in a father 
role, an adult female in a mother role, 
a 17-year-old girl in a daughter role 
and an 11-year-old boy in a son role. 
The scene enacted is that of a family 
dinner and, although placed at varying 
distances from the characters during 
the filming, the camera was focused on 
the face of the person speaking. Props 
and situational cues are used freely. 
The film is photographed in black-and- 
white on 16 mm film and runs con- 
tinuously for five minutes. The manual 
used with the Morkovin film includes 
three groups of questions to be em- 
ployed in evaluating comprehension of 
the film. The first group includes ques- 
tions designed to evaluate the individ- 
ual’s attention to the non-verbal cues 
presented in the film. The questions in 
the second group relate more directly 
to the conversation employed in the
film while the third group, which is a
multiple-choice type of test, is utilized 
to test over-all comprehension. It was 
decided that the second group of 19 
questions would serve as a better meas- 
ure of an individual’s comprehension 
of the film, especially in terms of 
‘thought units.’ 


Results and Discussion 

Comparisons with Mason Film. The 
criterion measure on the Mason test 
was the total number of words cor- 
rectly identified. Scores were com- 
pared by means of the Pearson r pro- 
cedure with scores obtained on Parts 
I and III of the Utley test and on the 
Morkovin test. The three obtained cor- 
relation coefficients ranged from .49 
to .56 and were significant at or be- 
yond the 5% level. These results in- 
dicate that there is a better than chance 
relationship between scores on tests 
based on the Mason film and the Utley 
and Morkovin films. These findings are 
of some interest in that Part III of 
the Utley film and the Morkovin film 
purportedly deal with thought content 
rather than individual word recognition.
It would appear that the grouping
of thought content and the recognition
of individual words involve
somewhat similar skills.

Comparisons Between Utley and
Morkovin Films. Scoring on Part I of
the Utley test was on the basis of one 
point per correctly recorded word. In 
Part III of the film a weighted score of 
three was arbitrarily assigned for each 
correctly recorded idea. Such a scoring
system is similar to that employed
for the Morkovin film. The correla- 
tions computed between the scores ob- 
tained on the two Utley tests and the 
Morkovin test (.26 and .27) were not 
significant. There is thus no evidence 
on the basis of these results (a) that
the two tests (Morkovin and Part III 
of Utley), which purportedly measure 
recognition of thought units, measure 
the same skills nor (b) that recognition 
of thought units (Morkovin) is related 
to word recognition (Part I of Utley). 
In the light of present results, it would 
therefore appear that the Utley test 
and the Morkovin test cannot be 
thought of as interchangeable tests of 
lip-reading ability. 
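
The split between the significant coefficients (.49 to .56) and the nonsignificant ones (.26 and .27) can be retraced with the usual t test for a Pearson r. This is a sketch under assumptions the text does not state: that all 26 subjects contributed to every coefficient, and that a two-tailed test was used. The critical value 2.064 is the standard table entry for 24 degrees of freedom.

```python
import math

# Two-tailed 5% critical value of Student's t for 24 degrees of freedom,
# from a standard t table (assumption: N = 26 pairs, so df = 24).
T_CRIT_DF24 = 2.064


def t_from_r(r, n):
    """Convert a Pearson r based on n pairs into a t statistic with n - 2 df."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def critical_r(t_crit, n):
    """Smallest absolute r that reaches the given critical t with n pairs."""
    df = n - 2
    return t_crit / math.sqrt(t_crit * t_crit + df)


N = 26
for r in (0.49, 0.56, 0.26, 0.27):
    t = t_from_r(r, N)
    verdict = "significant" if abs(t) >= T_CRIT_DF24 else "not significant"
    print(f"r = {r:.2f}  t = {t:.2f}  {verdict} at the 5% level")
print(f"critical r for N = {N}: {critical_r(T_CRIT_DF24, N):.3f}")
```

Under these assumptions any coefficient below roughly .39 falls short of the 5% level, which is consistent with the pattern reported above.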

Comparisons with Morkovin Film. 
As was indicated previously, the cor- 
relation between scores obtained for 
the Mason test film and for the Morko- 
vin test film was significant while the 
correlations obtained between the scores 
on the Morkovin film test and the 
scores on the two sections of the Utley 
film test were not. It should be reiter- 
ated that the Morkovin film was not 
originally constructed to be a lip- 
reading test. It was one of a series of 
10 films designed to be used in an or- 
ganized teaching program. The films 
were planned to assist lip-reading stu- 
dents to develop an orientation to con- 
textual details. Thus, the ordinary stu- 
dent of lip reading might not give as 
good an account of himself on the film 
as might a person who had had some 
training in the Morkovin method. It 
does not necessarily follow, however, 
that the film cannot be employed as a 
test of lip-reading ability. It seems 
possible that the Morkovin film, in 
light of the obtained significant cor- 
relation between scores on the Morko- 
vin and scores on the Mason film, is 
more of a word recognition test than 
a thought recognition test. This state- 
ment, however, must be accepted with 
caution since the construction of the 
questions used in this investigation may 
have biased the results in such a direc- 


tion. Nevertheless, it is difficult to ex- 
plain the relatively low correlation be- 
tween scores on the Morkovin film and 
the Utley test, Part III, since they em- 
ploy similar testing material. These 
anomalous findings are possibly attributable
to sampling errors in either
or both of the Morkovin-Mason and
Morkovin-Utley III coefficients. DiCarlo
and Kataja (1) suggest that the
fault lies with the Utley test and not 
with the Morkovin test. However, their 
results, which were based on questions 
pertaining to the Morkovin film, were 
actually the results of a form of face- 
to-face lip-reading test. Thus they were 
measuring lip-reading ability at the 
same time that they were supposedly 
testing recall on the Morkovin film. 
Therefore, the results of their study 
may not be applicable in the present 
discussion. 

Comparisons Between Instructor Ratings
and Test Scores. Three lip-reading
instructors associated with the Columbus
Hearing Society rated the experimental
subjects in two ways: (a)
by rating on a scale from one to five
the lip-reading ability of the subjects 
and (b) by a rank order listing of the 
subjects in terms of their lip-reading 
ability. The rating-scale value was ob- 
tained from the instructor who had had 
the most recent instructional experience 
with the subject. For the rank order 
listing, 19 subjects were ranked by 
only one instructor, six subjects were 
ranked by two instructors and one sub- 
ject by all three, with a resultant rank 
order listing of 12 subjects for two 
of the three instructors and 10 sub- 
jects for the other. 

The rating-scale values were corre- 
lated with the test scores obtained on 
each of the three lip-reading tests 
(Mason, Utley and Morkovin). The 
obtained Pearson rs were significant for
all four comparisons. The largest ob- 
tained correlation with teacher ratings 
was .60 for scores on the Utley, Part I, 
and the lowest was .49 for scores on 
the Utley, Part III. It is of interest to 
note that Utley (6) obtained a corre- 
lation of .42 between teacher ratings 
and the total test score (Parts I, II and
III). The present results seem to corroborate
Utley’s conclusion that it
would be inadvisable to use instructors’ 
judgments in evaluating lip-reading 
ability. It is unclear, however, whether 
the judges or the tests yield the more 
valid measurement. If the instructors’ 
ratings were accepted as the superior 
measure, it might be justifiably con- 
cluded that the validity of the tests 
needs to be substantially improved. Ut- 
ley, on the other hand, seemed more 
inclined to accept the test results than 
instructors’ ratings. The correlation co- 
efficients clearly provide no crucial 
evidence for the superiority of either 
measure. Before further research of 
this kind is undertaken, it would seem 
advisable to determine how much such 
correlations are being attenuated by un- 
reliability in both measures. 
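
One common way to estimate such attenuation is Spearman's correction, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch below is illustrative only: the study reports no reliability coefficients, so the two values of .80 are hypothetical, and only the observed .60 comes from the text.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's correction: estimated correlation between true scores."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Observed correlation between ratings and Utley Part I scores, from the text.
observed = 0.60
# Hypothetical reliabilities for the ratings and the test; not reported in the study.
corrected = correct_for_attenuation(observed, rel_x=0.80, rel_y=0.80)
print(f"observed r = {observed:.2f}, corrected r = {corrected:.2f}")
```

The lower the reliability of either measure, the larger the gap between the observed and the corrected coefficient.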

Rank order correlations were com- 
puted to compare rankings made in- 
dependently by each of three instruc- 
tors with rankings obtained on the tests. 
The results can be summarized as follows:
Teacher I rankings (N = 12)
correlated significantly with only the
Morkovin test scores (ρ = .63);
Teacher II rankings (N = 12) correlated
significantly only with scores on
the Utley Test, Part III (ρ = .72);
Teacher III rankings (N = 10) correlated
significantly with the Mason,
Utley I and Morkovin test scores (ρ
= .84, .84 and .87, respectively). The
pattern of relationships from teacher
to teacher is apparently rather inconsistent,
with wide variation in the
extent of agreement between teacher
rankings and test rankings.
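
A rank order coefficient of the kind reported here can be computed as a Pearson correlation between two sets of ranks. The sketch below uses invented data for 10 subjects; the study's individual rankings and scores are not given, so every number in the example is hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Rank order correlation: Pearson r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: a teacher's rank (1 = best) and a test score per subject.
teacher_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
test_score = [310, 295, 300, 270, 280, 240, 250, 200, 220, 180]
# Negate the scores so that a high score corresponds to a low (good) rank number.
rho = spearman_rho(teacher_rank, [-s for s in test_score])
print(f"rho = {rho:.2f}")
```

With only 10 or 12 subjects per teacher, a change of a few rank positions moves the coefficient considerably, which may account for some of the inconsistency from teacher to teacher.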


Summary 


This study was conducted to in- 
vestigate (a) relationships among sets 
of scores obtained from 26 hard-of- 
hearing subjects on three silent motion- 
picture film tests (Mason, 30; Utley, I 
and III; Morkovin, 101) designed to 
evaluate lip-reading ability, and (b) 
relationships between teacher ratings of 
lip-reading ability and scores achieved 
on each of the three tests. Test scores 
based on the Mason film test of lip- 
reading skill correlated significantly 
with test scores based on both the 
Utley and Morkovin films. However, 
the Utley film test scores did not cor- 
relate significantly with the Morkovin 
test scores. Teacher ratings of lip- 
reading ability on a five-point scale 
correlated significantly with test scores 
on each of the three film tests of lip-reading
ability. However, teacher rankings
of the subjects with regard to lip-reading
proficiency correlated significantly
with test scores in only five
out of 12 comparisons.


Transition And Release
As Perceptual Cues For 


WILLIAM S-Y. WANG 


The perception of speech in its every- 
day form involves at least two sets of 
variables (8): the physical information 
present in the acoustical wave and the 
linguistic code with which the listener 
interprets the physical information. 

In the perception of final, postvocalic 
plosive consonants in noncontextual 
monosyllables, the physical information 
includes: (a) the duration of the pre- 
ceding vowels (3), (b) the formant 
transitions (6) and (c) the duration 
of the voicing that may follow the 
formant transitions. When a plosive is 
released, or when there is a cluster of 
two plosives, additional cues may be 
found in (a) the duration of the gap 
(7) and (b) the duration, (c) the in- 
tensity and (d) the spectral properties 
of the release (2). In English, the first 
plosive of a cluster is frequently not 
released but is signalled by the form- 
ant transitions alone and the second 
plosive is signalled by the release. 

The linguistic code, with which the 
listener interprets the physical information,
conditions him to perceive the
consonants in the following cases with 
varying degrees of proficiency: (a) 
when the consonants do not occur in 
his native language; (b) when the con- 
sonants are in a sequence that does not 
occur in his language (1); (c) when a 
syllable containing the consonants con- 
forms to the phonotactic (phonetic se- 
quence) rules of his linguistic code but 
is not a word to him; (d) when the 
syllable is a word of low frequency of 
occurrence in his language, and (e) 
when the syllable is a high frequency 
word (9). 

If the native language of the listener 
is English, the above mentioned cases 
can be exemplified by the five syllables 
[sɪq], [sɪtp], [sɪg], [sɪkt], [sɪt]. These
syllables are in the order of increasing 
familiarity to the English-speaking 
listener. It would appear probable that 
he will find them increasingly easy to 
perceive correctly. 


Methods in Studying Speech Per- 
ception. Within the last decade, inves- 
tigations on the perception of speech 
have been rather intensively pursued 
by means of electronically synthesized 
speech (6, 17). An advantage of gen- 
erating speech electronically is that, in 
certain respects, greater precision and 
ease of control in the production of 
sounds can be achieved with calibrated
instruments than with the human
speech mechanism. Studies with electronic
speech usually simplify the
acoustical pattern of actual speech by
deleting certain parameters and isolating
and varying certain others.
Listeners’ responses are then correlated
with the systematic changes made in
the synthesized speech.

Table 1. Monosyllabic words used as the basic
corpus. The left column lists the initial consonants
for each row; the top row lists the final consonants
for each column.

       p      t      k      b      d      g
p      pip    pit    pick                 pig
b             bit           bib    bid    big
s      sip    sit    sick   sib    Sid
r      rip           rick   rib    rid    rig

Experiments also have been carried 
out using samples of normal speech, 
usually in the form of intelligibility 
tests. While there are more variables 
to control in natural speech, it yields 
results whose direct relevance to ac- 
tual speech need not be justified. Vari- 
ous types of masking noise, clipping, 
phase shifts and time and frequency 
distortion have been used to study the 
perceptual properties of speech sounds. 


Procedure 


The present study used human 
speech and systematic modifications of 
it as the test stimuli. It attempts to in- 
vestigate the relative significance of 
various acoustical cues in the percep- 
tion of final plosive consonants. The 
method involved deleting or inter- 
changing these cues. 


Test Materials. Table 1 lists the 
monosyllabic words which were selected.
Each contained the vowel [ɪ]
and ended in a plosive. These words
were recorded with all the plosives
released and with what was believed 
to be uniform prosodic features. The 
words were then copied four times 
onto new magnetic tape at a speed of 
30 in./sec. These four sets of mono- 
syllables are hereafter referred to as 
Sets A, B, C and D. 

Set A was unmodified. The releases 
of Set B were removed by cutting 
away the tape approximately seven 
centiseconds after the termination of 
the formants. The voiceless releases of 
Set C were replaced by voiced releases 
of the same articulatory positions and 
vice versa. In Set D the natural releases 
were replaced by releases from other 
articulatory positions, without changing 
the feature of voicing. For Sets C and 
D, an attempt was made to maintain 
the tape junction midway between the 
termination of the formants and the 
spike of the release. The syllables which
were used in Sets A, B, C and D were
English words if they were considered
to end in the plosive signalled by the
cues either before or after the tape
junction. However, when these syllables
are considered as ending in clusters,
then only those syllables which
end in a non-geminated [t] or [d]
conform to the phonotactic patterns of
English. Due to a mistake in tape-splicing,
the syllables [rɪg-] and [pɪk-]
of Set B were eliminated from the test
materials. The remaining 64 syllables are
listed in Table 2.

Table 2. The four sets of syllables used in listening
tests. Set A contains unmodified monosyllabic
words. The hyphens indicate the places of tape
junction where the releases were deleted or interchanged.

Set A     Set B     Set C      Set D
[pɪpp]    [pɪp-]               [pɪp-k]
[sɪpp]    [sɪp-]    [sɪp-b]    [sɪp-t]
[rɪpp]    [rɪp-]    [rɪp-b]    [rɪp-k]
[pɪtt]    [pɪt-]               [pɪt-p]
[bɪtt]    [bɪt-]    [bɪt-d]    [bɪt-t]
[sɪtt]    [sɪt-]    [sɪt-d]    [sɪt-p]
[pɪkk]    [pɪk-]    [pɪk-g]    [pɪk-p]
[sɪkk]    [sɪk-]               [sɪk-t]
[rɪkk]    [rɪk-]    [rɪk-g]    [rɪk-k]
[bɪbb]    [bɪb-]               [bɪb-d]
[sɪbb]    [sɪb-]    [sɪb-p]    [sɪb-d]
[rɪbb]    [rɪb-]    [rɪb-p]    [rɪb-g]
[bɪdd]    [bɪd-]    [bɪd-t]    [bɪd-b]
[sɪdd]    [sɪd-]    [sɪd-t]    [sɪd-b]
[rɪdd]    [rɪd-]               [rɪd-g]
[pɪgg]    [pɪg-]    [pɪg-k]    [pɪg-g]
[bɪgg]    [bɪg-]               [bɪg-d]
[rɪgg]    [rɪg-]    [rɪg-k]    [rɪg-b]

Figure 1. Broad band spectrograms of one syllable
from each of the four test sets. The
hyphens indicate the places of tape junction.

Spectrograms of one syllable from
each set are shown in Figure 1. The
first letter after the vowel represents
the consonant cues present before the
tape junction and the second letter represents
the consonant cues present after
the tape junction. The hyphens indicate
the places of the tape junction.

The 64 syllables were mixed and re- 
corded in random order onto a test 
tape with approximately four seconds 
between syllables. 
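
The figure of 64 syllables can be retraced from Table 1. The sketch below rests on one reading of the procedure: a Set C item exists only where swapping the voicing of the final plosive yields another word of the corpus. Set D is not derived, since the choice of replacement release for each word is not stated as a rule; its size of 18 is simply read off Table 2. The string notation for the items is illustrative.

```python
# Table 1 corpus as (initial consonant, final consonant) pairs; every word has the vowel [ɪ].
CORPUS = {
    ("p", "p"), ("p", "t"), ("p", "k"), ("p", "g"),
    ("b", "t"), ("b", "b"), ("b", "d"), ("b", "g"),
    ("s", "p"), ("s", "t"), ("s", "k"), ("s", "b"), ("s", "d"),
    ("r", "p"), ("r", "k"), ("r", "b"), ("r", "d"), ("r", "g"),
}

# Plosives sharing an articulatory position but differing in voicing.
VOICING_PARTNER = {"p": "b", "b": "p", "t": "d", "d": "t", "k": "g", "g": "k"}

# The two Set B items lost through the tape-splicing mistake: [rɪg-] and [pɪk-].
ELIMINATED_FROM_SET_B = {("r", "g"), ("p", "k")}


def set_c_items(corpus):
    """List items whose voicing-swapped release also spells a corpus word."""
    items = []
    for initial, final in sorted(corpus):
        swapped = VOICING_PARTNER[final]
        if (initial, swapped) in corpus:
            items.append(f"{initial}ɪ{final}-{swapped}")
    return items


set_a = sorted(CORPUS)
set_b = [word for word in set_a if word not in ELIMINATED_FROM_SET_B]
set_c = set_c_items(CORPUS)
SET_D_SIZE = 18  # read off Table 2 (one item per word), not derived here

total = len(set_a) + len(set_b) + len(set_c) + SET_D_SIZE
print(len(set_a), len(set_b), len(set_c), SET_D_SIZE, total)
```

Under this reading Set C has 12 members, and the four sets together give 18 + 16 + 12 + 18 = 64 test syllables.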


Subjects. The test tape was played to 
two groups of listeners. Listeners of 
Group I all had had rather intensive 
training in phonetics. All but one either 
had field experience in linguistics or 
had taught courses in language or pho- 
netics. Listeners of Group II had vir- 
tually no background in phonetics but 
were all native speakers of American 
English. Most of the listeners in both 
groups were graduate students at the 
University of Michigan. The test tape 
was played to each listener individually
through a pair of PDR-8 earphones at
a constant setting. Listeners of Group I
were informed of the make-up of the 
syllables, as described in the preceding
paragraphs, and were asked to
transcribe the syllables phonetically. 
Listeners of Group II were given mul- 
tiple-choice answer sheets which did 
not permit clusters as possible respon- 
ses. For example, for a test-tape syl- 
lable which begins with [p] the choices 
on the answer sheet were the words in 
the top row of Table 1, pip, pit, pick, 
and pig. Group II subjects were asked 
to circle the word which they believed 
the syllable to be. In most cases, five 
syllables were played for the listener 
(in either group) from the latter part 
of the test tape in order to orient him 
to the test conditions. 

The two groups consisted of 20 
listeners each. Multiplied by 64 syl- 
lables the total number of elicited re- 
sponses would have been 2560. There 
were nine question marks and seven 
alternative identifications, most of 
which were volunteered by the listeners
of Group I. (To facilitate comparison,
the results of the listening test
are presented on a percentage scale in
all the graphs.)


Results 


Measurement of Duration. Spectrograms
of the 64 syllables were made
for the measurement of the durations 
of the various vowels and gaps. It was 
found that although no extra effort was 
made to control the vowel lengths in 
recording the syllables, 57 of the 66 
vowels were approximately 12 csec in 
duration (within 1 csec of this value). 
It is interesting to note in passing that
the five short vowels at 9 and 10 csec
were all before voiceless consonants
(3) and that the four vowels at 14 csec
were all before voiced consonants.
For Set B the cuts were made mostly
at 7 and 8 csec from the termination of
the formants. The lengths of the gap
of Set A ranged from 14 to 28 csec.
The tape junctions for Sets C and D
occurred approximately midway between
the termination of the formants
and the spike of the release.

Figure 2. The percentage of judgments based
on either transitions or releases for the Set C
syllables. The solid lines are to be read from
left to right and the broken lines from right to
left.

Figure 3. The percentage of judgments based
on either transitions or releases for the Set D
syllables.


In the examination of the data no 
significant correlation was found be- 
tween the perception of the plosives 
and the duration of the gaps. 


Transition Versus Release. In Figures 
2 and 3 the plosive cues before the tape 
junction are listed in the column at the 
left of the graphs and the correspond- 
ing plosive cues after the tape junc- 
tion are listed in the columns at the 
right. The solid lines, to be read from 
left to right, indicate the percentage 
of judgments based on the cues of the 
left columns; the corresponding broken 
lines, from right to left, indicate the 
percentage of judgments based on the 
cues of the right columns. The inter- 
mediate spaces between the solid and 
the broken lines represent the judg- 
ments based on neither of these two 
types of cues. 


To illustrate, the top row in Figure 
2 shows the Group I responses to Set 
C syllables which end with [p-b]. 
Here 5% of the listeners identified the
syllables as ending in [p] and 67½% in
[b] while 27½% transcribed clusters
or single consonants other than [p]
and [b].
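
The three percentages in this row (5, 67.5 and 27.5) sum to 100 and can be turned back into judgment counts. The sketch assumes that the row pools the two [p-b] items of Set C ([sɪp-b] and [rɪp-b]) over all 20 Group I listeners; the text does not state the base of the percentages, so the counts are inferred rather than reported.

```python
LISTENERS = 20    # size of Group I, from the text
PB_SYLLABLES = 2  # assumption: [sɪp-b] and [rɪp-b] are the only [p-b] items

total_judgments = LISTENERS * PB_SYLLABLES

# Percentages for the [p-b] row of Figure 2 as given in the text.
percentages = {"[p]": 5.0, "[b]": 67.5, "other": 27.5}

# Convert each percentage to the number of judgments it would represent.
counts = {label: pct * total_judgments / 100 for label, pct in percentages.items()}
for label, count in counts.items():
    print(f"{label}: {count:g} of {total_judgments} judgments")
```

On a base of 40 judgments every percentage corresponds to a whole number, which is why half-percent values appear in the data.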

The general pattern of Figure 2 sug- 
gests that the homorganic clusters of 
Set C were judged more as voiced 
plosives in both the voiced-voiceless 
and the voiceless-voiced sequences.
This can perhaps be explained by the 
fact that Set C syllables all had voicing 
through half the duration of the gaps. 
The listeners were apparently reacting
more to the presence of this cue than
to the significant lack of it in the other
half of the gap.

Figure 4. The percentage of correct judgments
for Sets A and B.

Another fact to be observed was the 
high percentage of judgments based 
on the pre-junction cues of the velar 
plosives. Spectrograms made of these 
syllables showed maximum movement 
in the formants for the [k] and [g] 
transitions. This suggests that the value 
of formant transitions as perceptual 
cues for consonants depends on the 
magnitude of the formant movements,
which, in turn, depends on the articulatory
positions of the particular
vowel and consonant involved.

The combined effect of the above 
two parameters is best seen in the bot- 
tom row. Here, the formant transitions 
for the voiced velar plosive dominated 
97.5% of the responses. 

Set D syllables in Figure 3 also 
showed a high percentage of judgments 
based on the velar transitions for Group 
II listeners. For Group I, the clusters 
which began with velar transitions had 
a high percentage of correct cluster 
identification. 

It appears that the clusters of the 
Set D syllables were identified more 
frequently by the cues present in the 
releases than they were by cues present 
in the formant transitions, whereas the
Set C clusters were more evenly di- 
vided in this respect. 


Correct Identifications. In Figure 4, 
the numerals along the ordinate in- 
dicate the percentage of correct iden- 
tifications for the respective plosives 
listed along the abscissa. The upper 
curves in the two graphs represent the 
scores of the two groups of listeners 
for Set A syllables, the lower curves 
for Set B syllables. With respect to 
both Set A and Set B, the phonetically 
untrained listeners made more correct 
identifications than those phonetically 
trained. This can perhaps be partly ex- 
plained by the fact that there was 
greater difficulty in transcribing the 
syllables than there was in simply se- 
lecting an answer. Also, the Group II 
listeners, all of whom were native speak- 
ers of English, were little handicapped 
by their lack of phonetic training since 
these two sets of syllables were all
normal English words. 

Although the released plosives 
were almost always better identified, 
the releases made a greater difference 
in reinforcing the identity of the voice- 
less plosives for both groups of listeners. 
This observation seems to be explain- 
able by the fact that before terminal 
junctures in informal American Eng- 
lish, voiced plosives are released less 
frequently than are voiceless plosives. 
Consequently, although the releases 
were additional cues for identifying 
the plosives, the voiceless ones were
more helpful.


In recent years, there has been much 
interest in relating speech perception 
to the influence of language back- 
ground. (One listener from Group I, in 
whose native language final plosives are 
always released, made 40% of her
errors in identifying the unreleased 
plosives.) To test such cross-language 
influences on perception, it would be 
preferable to use test items which are 
less conditioned to a particular lin- 
guistic code than the ones used here. 

The average percentages of correct 
identifications for Set C and Set D 
clusters fell to 36% and 32%, respec- 
tively. For Set C, clusters of the voiced- 
voiceless sequence were correctly iden- 
tified much more frequently than the 
voiceless-voiced clusters. This seems to 
check with the previous observation 
that voiceless releases are more helpful 
as identificational cues and appears to 
follow the general tendency in Ameri- 
can English to devoice a sound in the 
final position. This evidence seems to 
suggest that the sequence of occurrence 
of the members of a cluster affects the 
identifiability of the cluster. An analy- 
sis of the wrong cluster identifications 
revealed that more than 70% of these 
errors contained the correct members 
but in reversed sequence. Perhaps this 
evidence can be explained eventually 
by the effects of time smear. 

The four curves for Set D syllables 
were grouped partly according to 
‘units of difference in distinctive features’
(10). The members of the first
three clusters were different from each
other by three units. The members of
the next four clusters were different
by four units. The next four clusters
conformed to the phonotactic rules of
English, although the members were
different by only three units. In English,
two plosives cluster before junctures
only when the second is an
alveolar plosive, which frequently occurs
as the past tense morpheme. The
last curve shows that the tape-splicing
had not significantly decreased the intelligibility
of the syllables (Figure 5).

Figure 5. The percentage of correct judgments
for Sets C and D.

Although the curves for Set D syll- 
ables overlap, the clusters whose mem- 
bers were four units different from 
each other were more correctly iden- 
tified than the clusters with three units 
of difference on the left of the graph. 
But the clusters which are phonotac- 
tically permissible in English were bet- 
ter identified than both of the above, 
even though the members are also dif- 
ferent by three units. 


Discussion 


With respect to the last two graphs, 
an interesting question can be raised 
concerning a recent hypothesis on the 
frequency of consonant clusters (4, 10). 
This hypothesis assumes that the clus- 
ters whose members differ from each 
other by more units of distinctive fea- 
tures would be more identifiable than 
those with fewer units of difference.
Clusters differing from each other by
the same number of units would be 
approximately equally perceptible. 

However, in the Set D syllables, the 
English clusters with three units of 
difference were better identified than 
the non-English clusters with four units 
of difference. And, in the Set C syllables,
the members of the six clusters
presumably differed from each other
by the same number of units of dis- 
tinctive features. But one sequence of 
occurrence was decidedly more iden- 
tifiable than the other sequence. Con- 
sequently, the data for the Set C and 
Set D syllables in general do not support
the assumption of the above hypothesis. 

Theoretically, the question might be 
asked whether perception of linguistic 
units can be adequately specified by a 
set of physical parameters, such as the 
distinctive features. It has been pre- 
viously demonstrated that the pho- 
nemic recognition of synthesized vowels 
partly depends on the physical para- 
meters of neighboring synthesized 
vowels (5). The evidence of the pres- 
ent exploratory study also seems to 
suggest that correct identification of 
consonant clusters depends on the se- 
quence of occurrence within the clus- 
ter and on the relationship between the 
consonants and the contiguous vowels. 

Some interesting questions which 
have been raised previously are con- 
cerned with whether perceptibility can 
be predicted better by describing the 
units of difference between the par- 
ticular allophones involved in the clus- 
ter than it can be by describing the 
units of difference between the pho- 
nemes so involved. Should these units 
of difference be weighted somehow as 
to their relative strength as perceptual 
cues? 

These questions perhaps can lead to 
the development of an absolute scale 
of measurement for phonetic differ- 
ences between speech sounds. Ideally, 
the developmental process would be 
based on experimental data and the 
procedure would be independent of 
linguistic codes. It is reasonable to be- 
lieve that such a scale would tend to 
improve the testing of the perceptual
properties of speech sounds and of 
their distribution in various languages. 


Summary 


The relative significance of various 
acoustical cues in the perception of 
final plosive consonants was investi- 
gated. Human speech and systematic 
modifications of it were used as the 
test stimuli. Suggestions were made in 
the direction of an absolute scale of 
measurement for phonetic differences 
between speech sounds regardless of 
linguistic code. 


Frequency Analysis


Of Electroencephalograms 


Of Stutterers And Nonstutterers 


JOHN R. KNOTT 


ROBERT E. CORRELL 


JEAN N. SHEPHERD 


In the past, several investigations have 
sought differences between the electro- 
encephalograms (EEGs) of stutterers 
and nonstutterers. Travis and Knott 
(8, 9) found minor and unsystematic 
differences. Their studies were oriented 
around the lateral dominance theory of 
stuttering and purported to demon- 
strate greater bilateral asymmetry in 
stutterers than nonstutterers. However, 
this difference was found only during
silence and there was a curious correlation
between severity of stuttering
and increased bilateral matching, as indicated
by a rank order correlation coefficient
of -.66. Douglass (2) reported
that there were differences between
left and right hemispheres when
measures of percent-time alpha activity 
(8-13 per second occipital rhythm) 
were used as the discriminating scores. 
The mean percent-time alpha in the 
left hemisphere exceeded that of the 
right in the stuttering group but did 
not so exceed in the nonstuttering 
group. Knott and Tjossem (5) repeated
this study and confirmed the
Douglass finding.

Since these studies were reported,
theories of stuttering have shifted away 
from neurological bases. Contemporary 
concepts place a relatively greater em- 
phasis on psychological factors as hav- 
ing etiological significance. Wischner 
(12) has carried out a comprehensive 
analysis of stuttering behavior in terms 
of learning theory, for example. In sev- 
eral of the current theories terms such 
as ‘expectancy’, ‘anxiety’ and ‘fear’ have 
been used in the formulations of stut- 
tering behavior. Stuttering may be re- 
garded as motivated by anxiety, the 
anxiety being reinforced by the speech 
behavior itself. It seems that the terms 
cited above may tend to be used inter- 
changeably and that usage varies in 
undefined ways from theorist to theo- 
rist. Johnson and his co-workers (4), 
for instance, have used the term ‘expec- 
tancy’ in such a way as to suggest 
‘anxiety,’ although this is based merely 
on the experimentally demonstrable 
fact that stutterers predict with great 
statistical accuracy those words which 
produce stuttering behavior. 

As a psychological construct, ‘fear’ 
may be defined as an anticipatory pain 
response, the antecedent condition of 
which is punishment. When the arous- 
ing cues are vague or not explicitly 
known, the term ‘anxiety’ may be ap- 
plied. Both ‘anxiety’ and ‘expectancy’ 
may be associated with mechanisms 
leading to avoidance; but whether or 
not the concepts of ‘expectancy’ and 
‘anxiety’ as used in theories of stutter- 
ing are the same as the concept of ‘anx- 
iety’ employed as a psychiatric con- 
struct is perhaps open to question. 

‘Anxiety’ as a psychiatric construct, 
however, has been subject to neuro- 
physiologic investigation at the cerebral 
level by Ulett and his associates (10, 
11). It is important to stress that these 
investigations have not in any way been 
concerned with stutterers or stutter- 
ing behavior. They reported that a 
group of persons psychiatrically de- 
fined as ‘anxiety-prone’ could be dis- 
criminated from a group of ‘normal’ 
persons on the basis of certain EEG 
characteristics. Using an electronic fre- 
quency analyzer, they were able to 
achieve highly quantitative scores, and 
found that the total voltage generated 
in the alpha band was less in the anx- 
iety-prone than in the normal group. 
When the EEG response to a flashing 
light was investigated, further differ- 
ences between the groups were dem- 
onstrated. Anxiety-prone subjects, on 
the average, showed more ‘driving’ 
(that is, production of EEG rhythms 
at the rate of the flashing light) at fre- 
quencies above 20 flashes per second, 
and more ‘driving’ below 7 fps, but 
less ‘driving’ in the alpha range than 
did the normal subjects. Thus, it could 
be that the EEG may be used to eval- 
uate ‘anxiety-proneness’ from a basic 
neurophysiologic standpoint and that 
some etiologic assumptions may be 
forthcoming. 

Statement of the Problem. In view of 
the current application of the term 
‘anxiety’ and related concepts in the 
evaluation of stuttering behavior, an 
EEG study of groups of stutterers and 
nonstutterers, using the Ulett measures, 
might provide further information 
about neurophysiological differences 
between such groups. Such measures 
would have the added advantage of be- 
ing divorced from speech behavior, 
per se, as no speaking would be re- 
quired in the testing situation. 


Subjects and Method 


Three groups of subjects (total N = 
63) were utilized. Group I was com- 
posed of 19 stutterers available in the 
Speech and Hearing Clinic of the Uni- 
versity of Iowa. The mean age was 23 
years, 4 months. A comparison group 
of nonstutterers (N = 20) was com- 
posed of volunteer students from the 
University’s Department of Psychology 
and the Department of Speech Pathol- 
ogy and Audiology. This group had a 
mean age of 23 years, 9 months. A sec- 
ond group of stutterers (N = 24) was 
selected from the same source about 
two years later and was 
compared with the first stuttering 
group. The mean age of the Group II 
stutterers was 27 years, 0 months. 

In the course of the experiments, one 
subject was dropped from Group I for 
analysis of photic stimulation data, and 
four from Group II, because of equip- 
ment failure or (one case) inability of 
the subject to tolerate the subjective 
sensations induced by the flickering 
light. 

EEGs were recorded on an eight- 
channel Grass Model III electroen- 
cephalograph. An additional channel 
was provided to record the output of a 
Walter-type low frequency analyzer 
(6). This latter instrument provided 
an electronic analysis of the voltage 
present at each of 23 frequencies, over 
a range of 2 cps to 30 cps. The 
summed voltage present in successive 
10-second epochs is automatically 
plotted directly on the EEG record 
for the 10-second period concerned, 
and the amplitude of pen deflection of 
the analyzer channel at any selected 
frequency is considered as proportional 
to the summed voltage at that fre- 
quency within the epoch. 

The analyzer was calibrated daily 
with respect to frequency stability, 
‘Q’ (that is, band width at each fre- 
quency) and gain (at each frequency). 

In the treatment of the analyzer out- 
put, the voltage at each frequency was 
converted to percentage of total vol- 
tage for the epoch being analyzed. The 
score, therefore, was converted to a 
relative measure and not treated as an 
absolute datum. This conversion was 
effected to avoid difficulties inherent in 
frequency analysis when group data 
are to be used and when different am- 
plification is employed from subject 
to subject, as has been pointed out by 
Gibbs and Knott (3). 
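
The conversion to a relative measure can be illustrated with a short sketch. It is not a reproduction of the authors' procedure: the function name `relative_voltage` and the sample deflection values are hypothetical, chosen only to show that a percentage score is unaffected by a change in amplification from one recording to another.

```python
def relative_voltage(epoch):
    """Convert absolute analyzer output to percent of total voltage.

    `epoch` maps frequency (cps) to summed voltage for one 10-second
    epoch. The values used below are invented for illustration.
    """
    total = sum(epoch.values())
    if total <= 0:
        raise ValueError("epoch has no measurable voltage")
    return {freq: 100.0 * volts / total for freq, volts in epoch.items()}


# Hypothetical pen deflections at three frequencies for one subject.
low_gain = {8: 2.0, 10: 6.0, 12: 2.0}
# The same activity recorded with twice the amplification.
high_gain = {freq: 2.0 * volts for freq, volts in low_gain.items()}

print(relative_voltage(low_gain))
print(relative_voltage(high_gain))
```

Both calls give the same percentages, which is the property that allows group data to be pooled across subjects recorded at different gains.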


Photic stimulation was carried out 
with a locally constructed stroboscopic 
unit, using a GR 641 P-1 lamp. Dura- 
tion and intensity were constant at all 
flash rates. A photo-electric cell, fed 
into one channel of the EEG amplifiers, 
served as a monitor. 


The EEG electrodes were of the 
monopod type and were secured to the 
scalp with a flexible cap-type holder. 
A total of 14 leads was applied (left 
and right occipital, parietal, precentral, 
frontal, posterior temporal, anterior 
temporal, and separate reference leads 
on the lobe of each ear). For purposes 
of analysis of data, only the left and 
right occipital-parietal bipolar linkages 
were used for the observations based on 
response to flicker, and the left-occipi- 
tal-left-ear and right-occipital-right-ear 
(monopolar) linkages were employed 
to evaluate differences in hemispheric 
alpha output. This latter linkage gave 
data directly comparable to those of 
Douglass (2) and Knott and Tjossem 
(5). 

The flash rates chosen were at 3, 5, 
8, 10, 15, 20 and 30 flashes per second. 
Since interhemisphere differences in 
EEG response were to be measured, the 
flash sequence was presented twice; the 
first sequence gave data for the left 
hemisphere, the second, for the right 
hemisphere. Stroboscopic stimulation 
was carried out for three complete 
analyzer epochs (30 sec) and periods of 
equal length separated the stimulation 
intervals. 

Figure 1. Relative voltage, based on fre- 
quency analysis of left-occipital, left-parietal 
electrode pair (at frequency points from 2 
through 30 per sec) for Group I and Group 
II stutterers and for the nonstuttering com- 
parison group. LO = left occipital, LP = 
left parietal. Horizontal axis: frequency per sec. 

The analyzer data from photic stim- 
ulation were converted to magnitude 
of driving by comparing the magnitude 
of mean response for three epochs at 
the fundamental frequency (that is, 
flash rate, 3 per second; analyzer out- 
put, 3 per second) during the stimula- 
tion interval with the magnitude of the 
mean response in the three epochs pre- 
ceding stimulation. Differences in har- 
monic output were similarly compared 
(that is, flash rate 3 per second, output 
at 6, 9, 12, 15, 18, 24, 27 and 30 per 
second). In actual practice, harmonic 
output changes were observed only at 
20 per second for flash rates of 5 per 
second (fourth harmonic) and 10 per 
second (second harmonic) and at 30 
per second as the second harmonic of 
15 fps. 
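
The driving measure and the harmonic relations described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' computation: the text says only that stimulation epochs were "compared" with the preceding epochs, so the use of a simple difference of means is an assumption, and the function names and sample values are hypothetical.

```python
from statistics import mean

ANALYZER_HIGH = 30  # upper limit of the analyzer range in cps, from the text


def harmonic(flash_rate, n):
    """Frequency of the n-th harmonic (n = 1 is the fundamental)."""
    return flash_rate * n


def harmonics_in_range(flash_rate):
    """Harmonics above the fundamental that fall within the analyzer range."""
    top = ANALYZER_HIGH // flash_rate
    return [harmonic(flash_rate, n) for n in range(2, top + 1)]


def driving_magnitude(stimulation_epochs, preceding_epochs):
    """Assumed measure: mean during stimulation minus mean before it."""
    return mean(stimulation_epochs) - mean(preceding_epochs)


# The three harmonic relations at which changes were actually observed.
print(harmonic(5, 4), harmonic(10, 2), harmonic(15, 2))

# Hypothetical relative voltages for three epochs before and during flicker.
print(driving_magnitude([12.0, 14.0, 13.0], [9.0, 10.0, 11.0]))
```

A positive value indicates more output at the measured frequency during flicker than in the three epochs before it.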


Results 


The resting EEGs were compared, 
utilizing data from Group I and Group 
II (stutterers) and the comparison 
group (nonstutterers). The data were, 
as described above, converted to rela- 
tive voltage at each frequency and an 
analysis of variance carried out. Figure 
1 presents combined left and right 
hemisphere occipital-parietal linkage re- 
sults, giving the average voltage-fre- 
quency plot for each of the three 
groups. The major divergence occurred 
between stuttering Group I and stut- 
tering Group II; the group-by-fre- 
quency interaction was significant at 
the 1% level of confidence. The com- 
parison of the nonstutterers and the 
stutterers of Group II was similarly 
significant at the 1% level of confi- 
dence. Thus, using stuttering Group 
II, only, there appears to be a relative 
reduction over the alpha frequency 
band (8 to 13 per second) and relative 
increase over part of the beta fre- 
quency band (above 14 per second). 
This conclusion cannot be reached for 
stuttering Group I, however, so that 
no general difference appears between 
stutterers and nonstutterers. 

Summed-alpha-band measures for 
the three groups were compared by an 
analysis of variance, for the right and 
left hemispheres, independently. No 
differences between the right hemi- 
sphere summed-alpha measures were 
significant. For the left hemisphere, as 
would be expected from the first analy- 
sis, Group II was significantly below 
Group I (p = 0.01) and also below the 
nonstuttering comparison group (p = 
0.05). 

When Group I was compared to the 
nonstutterers, the interaction was sig- 
nificant at the 5% level of confidence. 
This appears to relate to some eleva- 
tion at 6 per second in Group I. Al- 
though this was a stuttering group, a 
similar relative voltage rise at that (or 
adjacent) frequencies was not apparent 
in stuttering Group II. 

Frequency by hemisphere interac- 
tion was significant at the 5% level 
between Group I and II. Hence, again, 
differences between groups of stutter- 
ers were greater than between groups 
of stutterers and nonstutterers, and no 
meaningful and consistent differences 
between such groups could be dem- 
onstrated. 

Two additional comparisons were 
made for the Group I stutterers and 
the nonstutterers. The relative activity 
in the alpha range for left and right oc- 
cipital-to-ear linkage (monopolar) was 
summed for each subject. The amount 
of alpha activity over the right occipi- 
tal area was compared by the t test for 
unrelated measures. The difference be- 
tween the Group I stutterers and the 
nonstutterers was not statistically sig- 
nificant. 

For each subject (Group I and non- 
stutterers), the difference between 
summed alpha of the left and the right 
occipital areas (monopolar linkage) 
was computed. The mean difference 
between left occipital and right occipi- 
tal summed-alpha activity (left oc- 
cipital minus right occipital) was 3.60. 
This was significant at the 1% level. 
However, a test of the variance ratios 
for the two samples indicates that the 
variances of the two distributions of 
left-occipital-minus-right-occipital dif- 
ferences are heterogeneous. Thus, the 
t test in this instance is only an ap- 
proximate test, although Norton (7) 
has suggested that if one doubles the 
probability level this could be accepted 
as reflecting a difference in the means. 
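
The reasoning in the preceding paragraph (a t test for unrelated measures, a check of the variance ratio, and a more conservative probability level when the variances are heterogeneous) can be sketched in a few lines. The sample values, function names and the pooled-variance form of the t statistic are assumptions made for illustration; the original data are not given, and `doubled_level` merely restates the doubling rule attributed to Norton (7).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """t statistic for two unrelated samples, using a pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def variance_ratio(a, b):
    """Larger sample variance divided by the smaller one."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


def doubled_level(nominal_p):
    """Doubling rule: read a nominal level p as 2p under heterogeneity."""
    return min(1.0, 2 * nominal_p)


# Hypothetical left-minus-right difference scores for two small groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

print(pooled_t(group_a, group_b))
print(variance_ratio(group_a, group_b))
print(doubled_level(0.01))
```

Under the doubling rule, a difference nominally significant at the 1% level would be read as significant at the 2% level.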

The results of photic stimulation at 
various flash rates were as follows. 
No significant difference, by analysis 
of variance, could be demonstrated be- 
tween nonstutterers and the Group I 
stutterers when response at the funda- 
mental frequency (that is, at the flash 
rate) was measured at flash rates of 3, 
5, 8, 15, 20 and 30 per second. How- 
ever, at 11 of these 12 frequencies the 
stutterers exceeded the nonstutterers. 
No significant difference could be 
demonstrated between the fundamental 
responses of Groups I and II; they 
were therefore combined and com- 
pared with the nonstutterers. No statis- 
tically significant difference could be 
demonstrated between the combined 
stutterers and the nonstutterers, al- 
though, again, at 11 of the 12 flash rates 
there was a greater mean magnitude 
for the stutterers. At 10 flashes per 
second, due to apparatus problems, 
there tended to be over-driving of the 
analyzer writeout pen, so that a skewed 
distribution of responses occurred. 
Therefore, the chi-square test was ap- 
plied to these data. No significant dif- 
ferences occurred between Groups I 
and II and the nonstutterers. 

Differences in magnitude of response 
at EEG frequencies falling at har- 
monics of the flash rates were eval- 
uated by the chi-square method. Each 
group was compared to the other two 
at the fourth harmonic of 5 fps, the second 
harmonic of 10 fps and the second har- 
monic of 15 fps. None of the differ- 
ences reached the 5% level of sig- 
nificance. 


Discussion 


Both of the groups of stutterers 
could be shown to differ from the non- 
stuttering group, in one way or an- 
other, but the two groups of stutterers 
differed more extensively from each 
other than from the nonstuttering 
group. In only Group II was there a 
difference in alpha band activity which 
could be said to follow the direction 
of the Ulett ‘anxiety-prone’ group. This 
group also showed a peak at 16 per 
second similar to that reported by 
Brazier, Finesinger and Cobb (1) in a 
neurotic patient group, but an unpub- 
lished collection of data (Knott), based 
on young adult normal subjects, shows 
a rise in voltage at 15 per second. 
Group I, on the other hand, showed 
neither a lower amount of voltage in 
the alpha band nor an increase in the 
beta band, but showed a slight rise at 
6 per second, in the theta band. This 
did not appear, significantly, in Group 
II. 

Photic stimulation failed to reveal 
significant differences between the 
groups. While the Ulett sample of 
‘anxiety-prone’ subjects showed higher 
voltage harmonic activity, when that 
activity fell in the 20 to 30 per second 
range, such a tendency could not be 
demonstrated in the stutterers investi- 
gated in this study. 

Some comment might be offered 
about the differences between the two 
stuttering groups (I and II). Several 
conceivable factors may have entered 
to produce these differences, although 
they are certainly minor. Group I was 
examined by a male experimenter and 
Group II by a female; all groups were 
predominantly male. Situational psy- 
chological tension does have an effect 
on the EEG, leading to a lower voltage 
of alpha band activity and relatively 
more beta band activity; if the male 
stutterers were embarrassed by the fact 
that they were stutterers, when in the 
presence of an attractive young wom- 
an, this could have possibly influenced 
the data. It is also possible that two 
extremes of sampling occurred. Knott 
(unpublished data) has reported an al- 
pha peak voltage midway between the 
peak voltages of these two samples of 
stutterers. Those investigators dealing 
with so-called ‘small samples’ should, in 
spite of the availability of ‘small sample 
techniques of analysis,’ be wary of such 
a possibility. 

With regard to the data based on re- 
sponse to flickering light, two differ- 
ences exist between the Ulett investiga- 
tion and this. Ulett’s data are based on 
absolute measures of voltage output at 
each frequency, while these data are 
based on relative measures. Ulett and 
his group used a light source so con- 
structed that the duration of the flash 
was inversely proportional to the rate 
of the flash. Also, the light-dark ratios 
were equal at any given flash rate. 

When the three groups here studied 
are considered, only one group, stutter- 
ing Group II, could be said to resemble 
the Ulett ‘anxiety-prone’ group, and 
this resemblance held only in terms of 
the voltage-frequency data of the rest- 
ing EEG. To the extent that such 
data reflect some biological substrate 
for ‘anxiety-proneness,’ then one could 
say that this group shows some evi- 
dence of such a substrate. However, 
the failure of this or any other of the 
Ulett measures to indicate such a cor- 
relate of ‘anxiety-proneness’ in Group 
I militates against any conclusion that 
there is such a biological substrate in 
‘stutterers in general,’ at least insofar as 
Group I affords a sample of the total 
population of stutterers. 


Summary 


Neurophysiological differences be- 
tween two groups of stutterers and a 
comparison group of nonstutterers 
were investigated in an EEG study, 
using the Ulett measures. In one stut- 
tering group there was a difference in 
alpha band activity which seemed to 
follow the Ulett ‘anxiety-prone’ group, 
but this was not true for the other 
stuttering group. On the basis of the 
various measures, both stuttering groups 
differed from the nonstuttering group 
in one way or another, but the two 
stuttering groups differed more exten- 
sively from each other than from the 
nonstuttering group. 


References 


1. Brazier, Mary A. B., Finesinger, J. E., 
   and Cobb, S., A contrast between the 
   electroencephalograms of 100 psycho- 
   neurotic patients and those of 500 normal 
   adults. Amer. J. Psychiat., 101, 1945, 
   443-448. 

2. Douglass, L. C., A study of bilaterally 
   recorded electroencephalograms of adult 
   stutterers. J. exp. Psychol., 32, 1943, 
   247-265. 

3. Gibbs, F. A., and Knott, J. R., Growth 
   of the electrical activity of the cortex. 
   Electroenceph. clin. Neurophysiol., 1, 
   1949, 223-229. 

4. Knott, J. R., Johnson, W., and Web- 
   ster, Mary J., Studies in the psychology 
   of stuttering: II. A quantitative evaluation 
   of expectation of stuttering in relation to 
   the occurrence of stuttering. J. Speech 
   Dis., 2, 1937, 20-22. 

5. Knott, J. R., and Tjossem, T. D., Bilat- 
   eral electroencephalograms from normal 
   speakers and stutterers. J. exp. Psychol., 
   32, 1943, 357-362. 

6. Knott, J. R., Wootery, A., and Randall, 
   J., Construction notes on an American 
   equivalent of the Walter analyzer. Elec- 
   troenceph. clin. Neurophysiol., 3, 1951, 
   91-96. 

7. Norton, Dee W., An empirical investi- 
   gation of the effects of nonnormality and 
   heterogeneity on the F-test of analysis of 
   variance. Unpublished Ph.D. dissertation, 
   University of Iowa, 1952. 

8. Travis, L. E., and Knott, J. R., Brain 
   potentials from normal speakers and stut- 
   terers. J. Psychol., 2, 1936, 137-150. 

9. Travis, L. E., and Knott, J. R., Bilater- 
   ally recorded brain potentials from nor- 
   mal speakers and stutterers. J. Speech Dis., 
   2, 1937, 239-241. 

10. Ulett, G. A., Gleser, Goldine, Lawler, 
    Ann, and Winokur, G., Psychiatric 
    screening of flying personnel. IV: An 
    experimental investigation of develop- 
    ment of an EEG index of anxiety toler- 
    ance by means of photic stimulation - 
    its validation by psychological and psy- 
    chiatric criteria. USAF School of Avia- 
    tion Medicine, Project No. 21-37-002, 
    Report No. 4, August 1952. (PB 107351) 

11. Ulett, G. A., Gleser, Goldine, Starr, 
    P., Haddock, J., Lingley, L., and Law- 
    ler, Ann, Psychiatric screening of flying 
    personnel: further studies toward the 
    development of an electroencephalo- 
    graphic screening technique. USAF 
    School of Aviation Medicine, Project No. 
    21-0202-007, Report No. 5, August 1953. 
    (PB 112130) 

12. Wischner, G. J., Stuttering behavior and 
    learning: a preliminary theoretical formu- 
    lation. J. Speech Hearing Dis., 15, 1950, 
    324-335. 










Equally Contributing Frequency Bands 


In Intelligibility Testing 
JOHN W. BLACK 


A persistent question in intelligibility 
testing is concerned with how scores 
assigned by two tests agree. Is the 
choosing of one of four possible re- 
sponses in multiple-choice intelligibility 
tests (2) comparable to writing a syl- 
lable or word in other tests? This ar- 
ticle treats the foregoing question from 
the standpoint of the contribution of 
bands of frequencies to intelligibility 
scores. 


Kryter (6) has reviewed the status 
of the articulation index, a development 
of workers at the Bell Telephone Labo- 
ratories (4, 5). He refers to the index 
as ‘the twenty band method’ for pre- 
dicting speech communication. The de- 
rivation of the index, that is, the 20 
bands that contribute equally to in- 
telligibility, has been meticulously ex- 
plained by French and Steinberg (5). 
Kryter distinguished between the use- 
fulness of the articulation index (20 
bands) to predict intelligibility in con- 
nection with frequency distortion and 
its usefulness in connection with mask- 
ing noise, particularly as applied by 
Beranek (1). 


One object of the present study was 
to determine 20 bands of frequencies 
which contribute equally to multiple- 
choice intelligibility scores and also to 
monosyllabic write-down scores over 
the same system of talkers, equipment 
and listeners. 


Procedure 


Five adult males read and recorded 
nine speaker-lists of the multiple-choice 
tests, Forms C and D and the alternative 
Forms C-1 and D-1 (2). The same 
speakers recorded 81 words each from 
the phonetically balanced word lists 
(3). No list or word was read more 
than one time. The speakers were in 
quiet, in a sound-treated room. The re- 
cording equipment included an Altec 
Lansing 21-D condenser microphone 
and an Ampex 350-3 recorder. Units of 
test materials were read at six-second 
intervals. 

The listeners were 44 men who were 
entering the naval pilot training pro- 
gram. Groups A and B included 22 men 
who heard the material under full- 
band and high-pass conditions. Groups 
C and D, the remaining 22, listened 
under full-band and low-pass con- 
ditions. 

Figure 1. Articulation Index versus cut-off 
frequency. All bands are at their optimum 
levels. 

Full-band and eight conditions of 
high-pass filtering were involved for 
Groups A and B: 250, 350, 550, 800, 
1300, 2250, 3500 and 5000 cps. Full- 
band and eight conditions of low-pass 
filtering were involved for Groups C 
and D: 7000, 5000, 3500, 2250, 1300, 
800, 550 and 450 cps. Nine levels were 
employed, covering a range of 80 db in 
10-db steps. Thus, the listeners experi- 
enced 9 x 9 or 81 combinations of level 
and frequency. The test materials for 
each speaker were viewed also as a 
series of nine blocks of nine stimulus 
units each. For example, nine monosyl- 
lables could be heard at one gain-setting 
and nine conditions of filtering, or in 
one condition of filtering and at nine 
gain-settings. The nine phrases of a mul- 
tiple-choice speaker-list offered the 
same possibilities. (There were three 
times as many multiple-choice items as 
monosyllables. Listening time was 
evenly divided.) 
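
The 9 x 9 design described above can be enumerated in a short sketch. It is illustrative only: the names are hypothetical, the high-pass cut-offs are those listed in the text, and the levels are written relative to the lowest one because the absolute levels are not stated.

```python
from itertools import product

# High-pass conditions listed in the text (cut-off frequencies in cps).
HIGH_PASS = ["full-band", 250, 350, 550, 800, 1300, 2250, 3500, 5000]
# Nine levels spanning 80 db in 10-db steps, relative to the lowest level.
RELATIVE_LEVELS_DB = list(range(0, 90, 10))

# Every pairing of level and filter condition.
conditions = list(product(RELATIVE_LEVELS_DB, HIGH_PASS))


def blocks_fixed_gain():
    """Nine blocks, each at one gain-setting across nine filter conditions."""
    return [[(level, f) for f in HIGH_PASS] for level in RELATIVE_LEVELS_DB]


def blocks_fixed_filter():
    """Nine blocks, each in one filter condition across nine gain-settings."""
    return [[(level, f) for level in RELATIVE_LEVELS_DB] for f in HIGH_PASS]


print(len(conditions))
print(blocks_fixed_gain()[0])
```

Either blocking plan covers the same 81 combinations; the two differ only in which factor is held constant within a block of nine items.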

Group A (11 listeners) heard the 
monosyllables and the multiple-choice 
material with each block of nine items 
at one gain-setting and in a succession 
of nine conditions of high-pass filter- 
ing. Group B (11 listeners) heard the 
same material with each block of nine 
items in one condition of high-pass 
filtering and in a succession of nine 
gain-settings. Groups C and D followed 
the same plan as Groups A and B, but 
heard conditions of low-pass filtering 
instead of high-pass. 

The order was from a full-band sig- 
nal to the maximum degrading of the 
signal. Two SKL filters, Model 302, 
provided 36 db-per-octave attenuation. 

The analytical procedures detailed 
by French and Steinberg (5) and il- 
lustrated in their Figures 10 to 16 
were followed. 


Results 


The articulation index as a function 
of frequency for the multiple-choice 
and the monosyllabic materials is 
plotted in Figure 1. The comparable in- 
dex reported by French and Steinberg 
also is plotted. 

Figure 2. Intelligibility score as a function 
of Articulation Index. 


Figure 2 shows the growth curve of 
the intelligibility scores as a function of 
the articulation index. This figure in- 
vites comparison with Kryter’s sum- 
mary of similar data (6), compiled both 
by him and by workers at the Bell 
Telephone Laboratory. The upper 
curve of Figure 2 is comparable to 
‘isolated words, BTL’ of the Kryter 
figure and the lower curve is never more 
than 10 percentage points displaced 
from ‘PB words, Pickett and Kryter.’ 

Table 1. Twenty frequency bands (in cps) making equal 
contribution to the Articulation Index when each 
is contributing optimally. 

Band   Multiple-    Write-Down      French and 
       Choice       Monosyllables   Steinberg 

  1     150- 230     300- 400        250- 375 
  2     230- 400     400- 500        375- 505 
  3     400- 565     500- 600        505- 645 
  4     565- 700     600- 690        645- 795 
  5     700- 850     690- 760        795- 955 
  6     850-1000     760- 850        955-1130 
  7    1000-1150     850- 950       1130-1315 
  8    1150-1350     950-1100       1315-1515 
  9    1350-1500    1100-1250       1515-1720 
 10    1500-1700    1250-1450       1720-1930 
 11    1700-1950    1450-1750       1930-2140 
 12    1950-2200    1750-1975       2140-2355 
 13    2200-2550    1975-2200       2355-2600 
 14    2550-2800    2200-2350       2600-2900 
 15    2800-3000    2350-2550       2900-3255 
 16    3000-3400    2550-2700       3255-3680 
 17    3400-3800    2700-2950       3680-4200 
 18    3800-4700    2950-3450       4200-4860 
 19    4700-5500    3450-5000       4860-5720 
 20    5500-7000    5000-7000       5720-7000 

The uniqueness of each communica- 
tion system in evaluations such as the 
present one is usually stressed. The de- 
gree of similarity among the various 
systems that have been tested in terms 
of articulation index is not a statistical 
matter; obviously the present systems 
fall within the range of the others that 
have been reported. More specifically, 
Table 1 permits a comparison of the 
present systems (including such items 
as materials, speakers, transmission line, 
listeners) to that of French and Stein- 
berg. Up to 1500-1700 cps, the system 
with multiple-choice material yielded 
an Articulation Index of 0.5 while the 
French and Steinberg result was 0.40 to 
0.45. Frequencies above 3500 cps con- 
tributed 10 to 20% to the present out- 
come and slightly more to the French 
and Steinberg result. 

Of prime interest, however, the mul- 
tiple-choice tests and the write-down 
monosyllabic tests were related to band 
width in about the same manner. The 
upper limits of filter number 10 were 
1450 and 1700 cps; of filter 5, 760 and 
850 cps; of filter 15, 2550 and 3000 
cps. 
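
The comparisons drawn from Table 1 can be checked with a small sketch. It rests on assumptions not stated in the article: each of the 20 bands is taken to contribute exactly 1/20 of the Articulation Index, the contribution is interpolated linearly within a band, and the lower limit of band 9 for the multiple-choice material is read as 1350 cps (the upper limit of band 8). The names are hypothetical.

```python
# Band edges in cps from Table 1: lower limit of band 1, then the upper
# limit of each of the 20 bands.
MULTIPLE_CHOICE = [150, 230, 400, 565, 700, 850, 1000, 1150, 1350, 1500,
                   1700, 1950, 2200, 2550, 2800, 3000, 3400, 3800, 4700,
                   5500, 7000]
WRITE_DOWN = [300, 400, 500, 600, 690, 760, 850, 950, 1100, 1250,
              1450, 1750, 1975, 2200, 2350, 2550, 2700, 2950, 3450,
              5000, 7000]
FRENCH_STEINBERG = [250, 375, 505, 645, 795, 955, 1130, 1315, 1515, 1720,
                    1930, 2140, 2355, 2600, 2900, 3255, 3680, 4200, 4860,
                    5720, 7000]


def articulation_index(edges, cutoff):
    """Index passed below `cutoff` cps, assuming equal band contributions."""
    bands = len(edges) - 1
    if cutoff <= edges[0]:
        return 0.0
    if cutoff >= edges[-1]:
        return 1.0
    for i in range(bands):
        low, high = edges[i], edges[i + 1]
        if cutoff <= high:
            # Whole bands below the cut-off plus a linear share of this one.
            return (i + (cutoff - low) / (high - low)) / bands
    return 1.0


for name, edges in [("multiple-choice", MULTIPLE_CHOICE),
                    ("write-down", WRITE_DOWN),
                    ("French and Steinberg", FRENCH_STEINBERG)]:
    below_1700 = articulation_index(edges, 1700)
    above_3500 = 1.0 - articulation_index(edges, 3500)
    print(name, round(below_1700, 3), round(above_3500, 3))
```

Under these assumptions the multiple-choice index reaches 0.5 at 1700 cps, while the French and Steinberg bands give roughly 0.40 at 1500 cps and 0.45 at 1700 cps, in line with the figures quoted in the text.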


Summary 


The purpose of this study was to de- 
termine in the manner of French and 
Steinberg the 20 bands of frequencies 
that contribute equally to intelligibility 
scores obtained by the use of multiple- 
choice tests, Forms C and D. (In mak- 
ing a response on these tests listeners 
attempt to identify which of four sim- 
ilar-sounding words was spoken. The 
intelligibility of a speaker or of a con- 
dition of testing is ordinarily deter- 
mined by the collective responses of a 
panel of listeners to 27 stimuli.) In the 
present study the speakers and listeners 
who spoke and responded to multiple- 
choice tests also spoke and heard com- 
mon monosyllables in the manner of 
write-down intelligibility testing. The 
results indicated the 20 bands of fre- 
quencies that contribute equally to cor- 
rect responses in hearing and writing 
monosyllabic words. 


Performances Of Normal-Hearing 
And Hard-Of-Hearing Persons 
On The Delayed Feedback Task 


ROBERT A. BUTLER 


F. THOMAS GALLOWAY 


In an attempt to devise special hear- 
ing tests for suspected malingerers, 
considerable emphasis has been placed 
on the delayed speech feedback task 
(2, 3, 4). The rationale underlying its 
use in detecting malingering is straight- 
forward: delayed speech feedback, 
presented at moderate to high inten- 
sity levels (from 40 to 80 db above 
speech reception threshold) usually 
disrupts ongoing speech. If the person 
under examination exhibits speech dis- 
turbance, it is assumed that his hearing 
is sufficiently acute to perceive the de- 
layed signal. Hence, his threshold of 
hearing for speech must be somewhere 
below that intensity level used in the 
feedback task. In almost every instance, 
a person with normal hearing will dem- 
onstrate speech disruption when the 
delayed feedback is presented at a high 
intensity level. There is little difficulty 
then, in exposing, by means of the de- 
layed speech feedback task, a normal- 
hearing person who feigns a total loss 
of hearing. However, other techniques, 
some much simpler than delayed feed- 
back, can be successfully employed in 
this type of case. The real problem 
occurs in instances where the patient 
may have a mild (40 db or less) hear- 
ing loss coupled with a severe nonor- 
ganic overlay. It may be quite apparent 
that the patient under examination does 
not have a total or severe hearing loss, 
but it is frequently difficult to state un- 
equivocally that his hearing falls within 
normal limits. 

The present experiment was designed 
to find out whether the delayed speech 
feedback task can effectively differen- 
tiate persons with mild to moderate 
hearing losses from those with normal 
hearing. The problem is not as simple 
as it might appear. To be more specific, 
for any given absolute intensity level 
of delayed feedback, it seems reason- 
able to expect that the speaking per- 
formance of a person with a hearing 
loss would be less affected than the 
speech of a normal hearing person. 
Two factors, however, operate to in- 
fluence adversely the accuracy of the 
prediction: (a) individuals possessing 
normal hearing vary markedly with 
respect to the intensity of feedback 
necessary to disrupt speech, and (b) 
aside from the large individual differ- 
ences in performance that exist under 


March 1959 










feedback conditions, the apparent loud- 
ness of speech for some hard-of-hearing 
patients does not follow the intensity- 
loudness function described for persons 
with normal hearing. At a fixed inten- 
sity level, the delayed feedback may ap- 
pear as loud for the pathological ear 
as it does for the normal ear. Loudness 
recruitment is particularly relevant to 
the delayed speech feedback task, since 
dramatic and indisputable effects are 
usually attained only at high inten- 
sity levels. Because of the recruitment 
phenomenon, some hard-of-hearing pa- 
tients would be likely to perform as 
if they had normal hearing. The pres- 
ent study was designed specifically to 
test this hypothesis. 


Procedure 


The subjects were 60 hard-of-hear- 
ing and 48 normal-hearing individuals, 
108 in all. To ensure that the hard-of- 
hearing subjects had legitimate hearing 
losses, only those persons were selected 
who had (a) reliable audiograms, (b) 
audiograms consistent with perform- 
ances on speech tests, (c) hearing loss 
for speech of 20 db or greater and (d) 
audiological and otological data in 
agreement with respect to the type of 
hearing loss. One-half of the 60 hard- 
of-hearing subjects had a perceptive 
type hearing loss. Their speech recep- 
tion thresholds ranged from 20 to 50 
db, with a mean of 31.1 db. The other 
30 hard-of-hearing subjects had a con- 
ductive hearing loss with speech recep- 
tion thresholds ranging from 20 to 52.5 
db, with a mean of 36.7 db. 

In this study, the technique used for 
eliciting speech differed from that usu- 
ally employed when studying the in- 
fluence of delayed feedback on speaking 
performance. Instead of reading a 
phrase or paragraph aloud, the subject 
was required to repeat two-digit num- 
bers as they were flashed singly on a 
glass panel. The participants were seat- 
ed in a sound-treated room facing the 
panel fixed to the wall. On the first 
series of numbers presented, the sub- 
jects were asked to observe the panel 
in order to acquaint themselves with 
the rate at which the numbers ap- 
peared. This phase of the testing pro- 
cedure enabled the subjects to learn 
the type of task that they would be 
required to perform. They were next 
told that, on the remaining five series 
of number presentations, they were to 
read the numbers aloud as rapidly and 
as accurately as possible. Each subject 
was informed that his voice would be 
heard in the earphones on some of the 
series. He was requested to try to ig- 
nore the sounds by concentrating on 
repeating the numbers as they flashed 
on the panel. Finally, the subject was 
told that the experimenter would sig- 
nal him immediately before the onset 
of each series of numbers. The experi- 
menter, having completed the instruc- 
tions, placed a third hand, supporting 
a microphone, over the subject’s shoul- 
ders and fitted a pair of earphones to 
the subject’s ears. 

The material read by the subjects 
consisted of five two-digit numbers (24, 
31, 58, 63, 82), illuminated singly on 
the panel. A series consisted of 50 con- 
secutively appearing numbers. 

The order of presentation was ir- 
regular; however, each number ap- 
peared an equal number of times during 
a series. The numbers were presented 
at a rate of two per second with each 
number visible for 300 msec. 
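
The presentation schedule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the original order was prearranged by relay wiring, not drawn at random, and the names make_series and schedule are hypothetical. The sketch builds one 50-item series in which each of the five numbers appears ten times, then assigns onset and offset times for a rate of two numbers per second with 300 msec of visibility.

```python
import random
from collections import Counter

NUMBERS = (24, 31, 58, 63, 82)   # the five two-digit numbers named in the text
SERIES_LENGTH = 50               # presentations per series
ONSET_INTERVAL_MS = 500          # two numbers per second
VISIBLE_MS = 300                 # each number is visible for 300 msec


def make_series(seed=None):
    """Return one irregularly ordered series with every number used equally often.

    Assumption: a seeded shuffle stands in for the prearranged order used in
    the study, which is not reported.
    """
    rng = random.Random(seed)
    series = list(NUMBERS) * (SERIES_LENGTH // len(NUMBERS))
    rng.shuffle(series)
    return series


def schedule(series):
    """Pair each number with its onset and offset time in milliseconds."""
    return [
        (number, i * ONSET_INTERVAL_MS, i * ONSET_INTERVAL_MS + VISIBLE_MS)
        for i, number in enumerate(series)
    ]


series = make_series(seed=1)
counts = Counter(series)         # each of the five numbers should appear 10 times
timeline = schedule(series)      # the last onset falls 24.5 sec after the first
```

With these figures a series lasts about 25 seconds, and the panel is dark for 200 msec between successive numbers.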

The order of presenting the experi- 





86 Journal of Speech and Hearing Research 








FIGURE 1. Diagram of apparatus used to
study effect of delayed feedback on
normal-hearing and hard-of-hearing subjects.
[Block diagram; labeled components: pulse
generator, relay bank, counter, stimulus
panel, microphone, recorder, attenuator,
PDR-8 earphones, monitor phones.]
Two-digit numbers, flashed on stimulus panel,
were repeated by subject under conditions
of delayed feedback and no delayed feedback.
Verbal responses were picked up by microphone
connected to tape recorder. For delayed
feedback conditions, recorded voice
was played back to speaker through earphones.
Intensity of feedback was controlled
by attenuator between recorder and phones.


mental conditions to the subjects re- 
mained invariant: Series 1, observe 
number; Series 2, no delayed feedback; 
Series 3, 50 db delayed feedback; Ser- 
ies 4, no delayed feedback; Series 5, 
80 db delayed feedback; Series 6, no 
delayed feedback. The intensities, 50 
and 80 db, refer to the number of dec- 
ibels above the median detection 
threshold for delayed speech feedback 
as measured on a group of nine normal- 
hearing persons. The delay time used in 
the delayed feedback condition was .17 
sec. 

The apparatus used in this study is 
diagrammatically presented in Figure 1. 
A pulse generator, Grass instrument S- 
4-A, controlled the rate and duration 
of pulses delivered to the master relay. 
The master relay, in conjunction with 


a stepping relay, served to illuminate 
the numbers on the panel in accord- 
ance with a prearranged order-of- 
number presentation. In addition, the 
master relay advanced a Veeder-Root
counter one digit each time a number 
appeared on the panel. And, as men- 
tioned earlier, a series consisted of 50 
number presentations. The subject’s 
verbal response to the numbers was 
picked up by a microphone, Electro- 
Voice, 655, which was connected to 
a tape recorder, Magnecord PT-6-J. 
After a delay of .17 sec the recorded 
voice was played back to the speaker 
through the earphones, Permaflux 
PDR-8. The intensity of the delayed 
feedback was controlled by an attenua- 
tor, Daven 350A, which was inserted 
between the tape recorder and the 
phones. 


The experimenter, wearing earphones 
connected to the tape recorder, moni- 
tored the subject’s performance and 
recorded on a hand counter the fre- 
quency of correctly repeated numbers. 
A number was accepted as being re- 
peated correctly if the speaker’s verbal 
response was intelligible to the experi- 
menter. It did not matter whether the 
speaker prolonged or interrupted the
word one or more times. Since the ver- 
bal responses were restricted to five 
different numbers, the monitoring task 
was not a particularly difficult one. 
Three different experimenters conduct- 
ed the tests. Agreement among experi- 
menters with respect to scoring per- 
formances was extremely high with 
rarely a discrepancy of more than one 
count. 

To calculate the effect of delayed 
speech feedback at 50 db, the number 
of digit-pairs repeated correctly in Se- 
ries 3 was subtracted from the average 
number of digit-pairs repeated correctly
in Series 2 and 4. This difference
score is referred to as the error score. 
The error score for the 80 db feedback 
condition was calculated by subtracting 
the score recorded on Series 5 from 
the average of the scores made on Se- 
ries 4 and 6. 
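
The error score calculation just described can be written out as a short function. This is a minimal sketch; the function name and the example counts are hypothetical and are not data from the study.

```python
def error_score(feedback_correct, before_correct, after_correct):
    """Error score for one delayed feedback series.

    feedback_correct: numbers repeated correctly under delayed feedback
                      (Series 3 for 50 db, Series 5 for 80 db)
    before_correct:   correct count in the preceding no-feedback series
    after_correct:    correct count in the following no-feedback series
    """
    baseline = (before_correct + after_correct) / 2
    return baseline - feedback_correct


# Hypothetical subject: 48 and 47 correct without feedback, 44 correct at 50 db.
example_50_db = error_score(44, before_correct=48, after_correct=47)  # 3.5
```

Because the baseline is the average of two whole-number counts, error scores fall on half-point steps, which is why the cutoffs reported in the Results are values such as 3.5, 4.0, 12.0 and 12.5.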

In view of the fact that this way of 
investigating the effect of delayed 
speech feedback on speaking perform- 
ance deviates from that customarily 
used, it seems appropriate to mention 
the similarities and differences between 
the two techniques. What is reflected 
in the error score, described above, 
is nothing more than the difference in 
reading rate between performances 
with and without delayed feedback. 
The subject, repeating the numbers 
flashed on the panel, simply cannot 
speak as rapidly when delayed feed- 
back is operating. By the time he fin- 
ishes saying one number, two or three 
others already may have flashed on and 
off the panel. 

This measuring technique provides 
data comparable to that obtained when 
reading rate is used as the measure of 
speech disturbance. Tests on 384 nor- 
mal-hearing persons showed that error 
scores increased linearly with progressive
increases in the intensity of the
delayed feedback. Furthermore, a de- 
lay time of approximately .2 sec was 
maximally effective, with longer and 
shorter delay times producing smaller 
error scores (1).

Actually, it is the authors’ contention 
that the only major difference between 
this technique and the one usually em- 
ployed is the way by which the read- 
ing material is presented to the speaker. 

From previous experience with delayed
speech feedback testing, the authors
were alerted to the shortcomings
of presenting reading material in the 
form of written paragraphs. In the first 
place, some persons probably counter- 
act, in part, the distracting effects of 
the delayed signal by concentrating 
on the content of the paragraphs. There 
is little question that some subjects at- 
tend to the theme of a paragraph when 
reading under delayed feedback con- 
ditions since they can recite the essential 
information after completion of the 
test. Another and perhaps greater dis- 
advantage in the use of written para- 
graphs is found in instances when a 
subject alternately decreases and in- 
creases his reading rate. The time re- 
quired for reading the passage may be 
little or no longer than that recorded 
when no delay was given, although it 
is obvious that he is disturbed by the 
delayed signal. 

The flashing number technique was 
devised to avoid these objectionable 
characteristics of the written para- 
graph. Since each number appears for 
less than a third of a second, the sub- 
ject’s visual feedback is reduced to a 
near minimum; hence there is no read- 
ing material continuously available up- 
on which to concentrate. Furthermore, 
when the reading material is presented 
at a controlled rate, the subject is un- 
able to compensate for a reduction in 
reading speed by suddenly reading 
very rapidly. Since he cannot possibly 
read at a rate faster than that at which 
the numbers are presented, he can never 
rectify his failure to repeat a number. 


Results 

Only at the 50 db intensity level of 
delayed feedback did the test differen- 
tiate hard-of-hearing from normal- 
hearing persons. The error score for the 
TABLE 1. Analysis of the data for 80 db intensity
of delayed speech feedback with respect to diagnostic
groups and size of error score. The number
of subjects in each diagnostic group is listed in
accordance with the size of their error scores.

Error Score        Hard of Hearing    Normal Hearing

12.0 and below           31                 22
12.5 and above           29                 26








majority (42) of hard-of-hearing sub- 
jects was 3.5 or less and the error score 
for the majority (34) of the normal- 
hearing subjects was 4.0 or more. On 
the other hand, the error score for only 
18 of the hard of hearing was 4.0 or 
more and for only 14 of the normal 
hearing was 3.5 or less. This difference 
between group performances was signi- 
ficant beyond the 1% level according to 
the results of a chi-square test. 
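
The group comparison can be checked from the counts reported in the text and in Table 1. The sketch below assumes the ordinary Pearson chi-square for a 2 x 2 table without continuity correction; the article does not say which form was used, and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom) for the table [[a, b], [c, d]].

    Assumption: no Yates continuity correction is applied.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


CRITICAL_1_PERCENT = 6.635  # tabled chi-square value, 1 df, 1% level

# 50 db: rows are hard of hearing and normal hearing;
# columns are error score 3.5 or less and 4.0 or more.
chi_50_db = chi_square_2x2(42, 18, 14, 34)

# 80 db (Table 1): columns are 12.0 and below and 12.5 and above.
chi_80_db = chi_square_2x2(31, 29, 22, 26)

significant_50_db = chi_50_db > CRITICAL_1_PERCENT
significant_80_db = chi_80_db > CRITICAL_1_PERCENT
```

Under these assumptions the 50 db value comes to roughly 17.8, well beyond the 1% criterion, while the 80 db value is below 1, in line with the statements that only the 50 db level separated the groups.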


TABLE 2. Mean error score at 50 db and 80 db
for each diagnostic group.

Hearing Diagnosis      Mean Error Score
                       50 db      80 db

Normal                  7.4       14.1
Perceptive              2.2       12.3
Conductive              3.2       10.5








The 80 db intensity level of delayed 
feedback was ineffective in differentiat- 
ing between groups (see Table 1). This 
finding can be attributed, in part, to 
the disproportionate effect of the 80 
db level of feedback on the perform- 
ances of the subjects with a perceptive 
type hearing loss. In Table 2 the mean 
error scores for the 50 and 80 db levels 
are listed separately for normal hearing 
persons and for those with either a per- 
ceptive or a conductive type hearing 
loss. The most informative aspect of 
these data is that the mean error score 
for the perceptively deafened group is 
about the same as that recorded for the 


conductive group when the intensity 
level was 50 db. Increasing the inten- 
sity of the delayed feedback from 50 
to 80 db, however, tended to produce 
a relatively greater effect on the per- 
formance of the perceptive group. This 
group showed an increase of 10.1 in 
the mean error score whereas the mean 
error score for the normal and conduc- 
tive groups increased only 6.7 and 7.3, 
respectively. 

The audiologist, however, is primar- 
ily interested in individual rather than 
group performance and when the data 
are viewed from this standpoint, the 
efficacy of delayed speech feedback 
as a tool is indeed disappointing. Even 
at the 50 db level, nearly 30% of the 
subjects were misclassified with re- 
spect to presence or absence of a hear- 
ing defect. Some hard-of-hearing per- 
sons performed as if they had normal 
hearing and, conversely, some normal- 
hearing persons, showing little or no 
effect of the delayed feedback, behaved 
as did the majority of deafened pa- 
tients. In addition, the results do not 
indicate that this measure of feedback 
disturbance can be used to predict with 
desired accuracy the amount of hear- 
ing loss for individuals. The standard 
error of estimate in predicting speech 
reception threshold of deafened patients 
from their error scores for the 50 db 
delayed feedback condition was
10.3 db. Since the results from the 80 
db level failed to differentiate between 
groups, no further analysis of these 
data was performed. 
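
The misclassification figure quoted above follows directly from the 50 db counts. The sketch assumes, as the reported counts imply, a cutoff between error scores of 3.5 and 4.0; the variable names are hypothetical.

```python
HARD_OF_HEARING_TOTAL = 60
NORMAL_TOTAL = 48

# 50 db counts reported in the Results.
hard_of_hearing_scored_as_normal = 18   # error score 4.0 or more
normal_scored_as_hard_of_hearing = 14   # error score 3.5 or less

misclassified = hard_of_hearing_scored_as_normal + normal_scored_as_hard_of_hearing
total_subjects = HARD_OF_HEARING_TOTAL + NORMAL_TOTAL
misclassification_rate = misclassified / total_subjects

print(f"{misclassified} of {total_subjects} subjects "
      f"({misclassification_rate:.1%}) misclassified at 50 db")
```

Thirty-two of the 108 subjects fall on the wrong side of the cutoff, which is the "nearly 30%" cited in the text.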


Discussion 


The data demonstrate the usefulness 
as well as the limitations of one measure 
of speech disturbance obtained from a 
delayed speech feedback task employed
for detecting mild to moderate hearing 
losses. The chief weakness is that the 
amount of hearing loss cannot be pre- 
dicted with sufficient accuracy for 
some clinical purposes. Coupled with
this is the finding that the performances 
of perceptively deafened patients, some 
of whom were probably exhibiting 
loudness recruitment for the delayed 
speech feedback, preclude the use of 
this technique at high intensity levels. 
This finding is interesting in view of 
the fact that a high intensity level of 
the delayed feedback is required for 
universally eliciting speech impairment 
in normal-hearing persons. At lower 
levels, some persons with normal hear- 
ing are not measurably affected by the 
delayed signal. In the present study, 
for example, all 48 subjects with normal 
hearing were affected at the 80 db 
level whereas four showed no measur- 
able effect when the delayed feedback 
was presented at 50 db. The problem, 
then, is reduced to this: in order not 
to produce a large effect on patients 
with hearing losses, and hence cause 
their performances to overlap nearly 
completely with those of normal-hear- 
ing subjects, it is necessary to set the 
intensity of the delayed signal where 
some persons with normal hearing ex- 
hibit no effect. These data showed that 
50 db of delayed feedback was an ef- 
fective compromise although other 
levels in the neighborhood of 50 db 
may be more efficient in discriminating 
between normal-hearing and hard-of-
hearing persons.

Perhaps a better differentiation be- 
tween the performances of hard-of- 
hearing subjects and those with normal 
hearing could have been achieved if an 
intensity function for delayed feedback 
effects had been obtained. The authors
preferred not to use this procedure 
because they believed it provides the 
subject with an opportunity to adapt to 
low intensity levels of delayed feed- 
back which, in turn, might influence 
his performance on subsequent tests. 


Regarding the usefulness of the de- 
layed speech feedback measure de- 
scribed, it provides some evidence as 
to whether a person has a mild to mod- 
erate hearing loss, provided the test is 
given at a moderate intensity level. As 
was mentioned in the beginning of this 
paper, it is difficult in the case of a 
malingerer to rule out the possibility 
that he actually has a legitimate hearing 
loss albeit a mild or moderate one. The 
present delayed speech feedback meas- 
ure can be of some use in these in- 
stances, although the exact amount of 
hearing loss cannot be ascertained by 
this technique. Of course, no single 
so-called objective test of hearing in 
existence can accurately and efficiently 
predict hearing threshold. What ap- 
pears to be needed, until better tech- 
niques are developed, is a battery of ob- 
jective tests which might be selected 
from those already available by mul- 
tiple correlation techniques. Those tests 
which make a significant contribution 
to the accuracy of prediction could 
thus be identified and the resultant pre- 
diction equation might be clinically 
useful in identifying malingerers. 


Summary 


The efficacy of a delayed speech 
feedback measure in differentiating 
normal-hearing from hard-of-hearing 
persons was investigated. The subjects 
were 48 persons with normal hearing 
and 60 who had a mild-to-moderate 
loss. The results showed that the hard-
of-hearing group could be distinguished
from the normal-hearing group when 
the feedback was presented at a mod- 
erate intensity level. At a high level, 
no significant group differences oc- 
curred. The advantages and disadvan- 
tages of one measure of delayed feed- 
back disturbance as an index of hearing 
acuity are discussed. 


Graduate Theses In


Speech And Hearing Research, 1957 


FRANKLIN H. KNOWER 


Graduate schools of 53 colleges and 
universities in the United States re- 
ported 171 thesis titles in the field of 
speech and hearing research in 1957. 
Of these, 134 were for master’s degrees 
and 37 were for doctor’s. These 171 
titles are listed below in two ways: 
First they are given by schools, then
by subject matter. In the first listing, 
the schools are arranged alphabetically 
by the distinguishing word in their 
titles; under each school the theses are 
alphabetized by the names of the au- 
thors. In this listing, each title is 
assigned a number. Second, the numbers 
are used to designate the titles in the 
subject index; asterisks indicate doc- 
toral dissertations. Many titles are in- 
dexed in more than one area of subject 
matter. 


Titles 


University of Arizona 
M.A. Theses 


1. Dewson, James. A comparison between 
observed and recorded nonfluencies in 
the speech of stutterers during repeated 
readings of the same passage. 

2. Griffith, B. Brackett. Adapting the 
conditioned eyeblink hearing test to 
the clinical situation. 

3. Ward, Allan L. A phonemic analysis
of the American English language as 
spoken by Arabic students. 


Boston University
M.Ed. Theses 


4. Isserow, Rachelle. A thematic appercep- 
tion comparison of stuttering and non- 
stuttering children. 

5. Massari, Gloria. A case study of the 
speech difficulties of twenty adults 
with foreign dialect (Italian). 

6. McLean, Doris. A planned program of 
study and guidance for parents of 
acoustically handicapped children. 

7. Mindness, Mary. An analysis of fic- 
tional literature for material suitable 
for use in a program of bibliotherapy 
for adolescent stutterers. 

8. Rice, Vera. A planned program of 
study and guidance for parents of 
acoustically handicapped children. 


Bowling Green State University 
M.A. Thesis 


9. Helmke, Ruth Gilson. A comparative 
study of the oral language behavior of 
a group of educable mentally retarded 
children and children of normal intel- 
ligence. 


Brigham Young University 
M.A. Thesis 


10. Barrow, Minnie Sue. An experimental 
study of attitude changes of the moth- 
ers of speech defective children result- 
ing from a speech therapy orientation 
program. 


Brooklyn College
M.A. Theses 


11. Becker, Laurence R. The influence of 
reduced sleep on the frequency of 
stuttering. 

12. Bernstein, Carol M. Relationships be- 
tween level of aspiration and level of 
performance in aphasics. 

13. Rosenwasser, Florence. The speech re- 
sults of the pharyngeal flap operation 
for cleft palate. 

14. Rubin, Morris M. A study of the consistency
and adaptation effects in ten-
to thirteen-year-old stutterers.


Columbia University Teachers College 
D.Ed. Thesis 


15. Scholl, Harold M. A study of the
speech therapy services at St. Barnabas
Hospital for Chronic Diseases.


Cornell University 
M.A. Thesis 


16. Erter, Marilyn L. The incidence of 
stuttering in males as compared with 
females on the basis of sex-role iden- 
tification. 


University of Denver 
Ph.D. Theses 


17. Neilson, Dorothy. The relationship be- 
tween the Rorschach movement re- 
sponse and verbal expression in lan- 
guage usage interpretation in terms of 
auto-plastic-allo-plastic polarities of
adaptation. 

18. Tarr, W. Fletcher. An inquiry into the 
existence of personality differences of 
speech defective college freshmen. 


Emerson College 
M.A. Thesis 


19. Crannell, Kenneth Charles. Breathing
patterns of the individual with a voice 
problem as compared to the trained 
and untrained speaker. 


University of Florida 
M.A. Theses 


20. Glover, Frances. A study of pure tone 
and speech perception of senior citi- 
zens, 80-89 years of age. 

21. Parker, Doris Joy. An experimental 
study of parental counseling regarding 
cleft palate problems. 


Ph.D. Theses 


22. Abbott, Thomas B. A study of observ- 
able mother-child relationships in stut- 
tering and non-stuttering groups. 

23. Shea, William L. The effect of supple- 
mentary parental corrective procedures 
on public school functional articulatory
cases.


Fresno State College 
M.A. Theses 


24. Ingram, Maria Markle. Speech correction
in the schools of Fresno, California.

25. O’Neil, Ann L. A survey of the speech
correction program of elementary
schools of Kern County, California.


University of Hawaii 
M.A. Theses 


26. Hayes, Robert Warren. A phonological 
study of the English speech of selected 
Japanese speakers in Hawaii. 

27. Speigel, Hazel Sachie. A comparative 
study of two methods of teaching 
Speech 101, a course in the sounds and 
rhythms of spoken English. 


University of Illinois 
Ph.D. Theses


28. Brutten, Eugene Jerome. A colorimetric 
anxiety measure of stuttering and 
expectancy adaptation. 

29. Quigley, Stephen Patrick. The vocal 
effects of delayed auditory feedback 
and their relationships to other indi- 
vidual characteristics. 


Indiana University 
M.A. Theses 


30. Filley, Florence Simon. Stimulability 
as a predictor of speech improvement 
for misarticulations by children re- 
ceiving no therapy. 

31. Kent, Louise Robinson. The use of 
meprobamate as an adjunct to stuttering 
therapy. 

32. Rintelmann, William F. A descriptive 
study involving the testing and pre- 
dicting of improvement of articulation 
of school children. 

33. Willey, Norman. A study of listener 
reaction to ‘normal’ versus stuttering
speech.


University of Iowa


M.A. Theses 


34. Duffy, Robert Joseph. Quantitative 
data on the speech nonfluencies of 
adult female stutterers. 

35. Gough, Kenneth Henry. A study of 
the effects of successive sessions of 
continuous oral reading upon adapta- 
tion and spontaneous recovery of the 
stuttered response. 

36. Hardy, James Chester. A phonetic 
study of misarticulation of /r/. 

37. Moodie, Catherine Elizabeth. A com- 
parative study of four psychological 
scaling methods applied to articulation 
defectiveness. 

38. Morris, Hughlett Lewis. A study of 
certain language skills in children with 
cleft palates. 


Ph.D. Theses


39. Siegel, Gerald M. An investigation of 
dysphasic speech performance in re- 
sponse to visual word stimuli. 

40. Wendahl, Ronald Wallace. Vowel 
formant frequencies and vocal cavity 
dimensions. 


Kansas State University 
M.A. Theses 


41. Woellner, Alberta. A phonetic study 
of the pronunciation of general American
English spoken by selected foreign
students from India and Pakistan.

42. Adams, Esther Young. The group ther- 
apy approach in a speech program for 
the young cerebral palsied population in 
the public schools. 

43. Edison, George. A study of the rela- 
tionship between speech and motor 
development. 

44. Franks, Beulah. The visual perceptual
abilities of stutterers and non-stutterers. 

45. Starrett, Jacqueline. A comparison of 
two tests of sound discrimination. 

46. Taylor, Marilyn. Standardization of a 
speech sound discrimination test. 

47. Wright, Beverly Sue. An analysis of 
the relationship between sound dis- 
crimination and articulation. 


Kent State University 
M.A. Thesis 


48. Caskey, Marylou. An approach to therapy
for the primary stutterer.


Louisiana State University 
M.A. Theses 


49. Domingue, Marilyn Gayle. The effect 
of differential stimuli presentation in 
the articulatory testing of non-readers. 

50. Rynes, Edward Joseph. A survey of 
speech and hearing defects in the ele- 
mentary schools of Baton Rouge. 


University of Maryland 
M.A. Theses 


51. Akiyama, Wallace Y. An investigation 
of relationships among tests of persev- 
eration and other tests of verbal and 
non-verbal abilities. 

52. Becker, E. Rheda. A study of rela- 
tionships between speech intelligibility 
and measures of auditory discrimination 
under conditions of synchronous and 
delayed sidetone. 

53. Bowling, Lloyd S. Observations of 
characteristics of fifteen teenage stut- 
terers. 

54. Hillis, James W. Relationships between
tests of word naming and tests


Matzker, Joseph, Ein Binauraler Hörsynthese-
Test zum Nachweis Zerebraler Hörstörungen,
mit 46 Abbildungen (A Test of Binaural
Auditory Synthesis for the Detection of
Cerebral Hearing Disorders, with 46 illustrations).
Stuttgart: Georg Thieme Verlag,
1958. Pp. 117. $4.25.


The author developed this test with the 
purpose of examining specifically the func- 
tion of the central auditory system. In prin- 
ciple the test material, which in this case is 
speech, is divided into two frequency bands, 
from 500-800 cps and from 1500-2400 cps. 
Speech restricted to each band alone is prac- 
tically unintelligible but both bands together 
presented to one or both ears simultaneously 
give good intelligibility. If the two bands 
are presented to each ear separately, the low 
band to one side and the high band to the 
other side, people with normal brain function 
are able to synthesize these two bands; they 
then make only a few errors in repeating 
speech material. In people with dysfunction 
of the central nervous system this ability to 
synthesize is often impaired, and in this case 
significantly more errors are made. This 
result is called ‘binaural test positive.’ 

The practical application of this test was 
studied on 1000 subjects, both normal and 
pathologic. Under conventional physiologic 
conditions, binaural-test-positive results were 
found in 34% of children below 14 years 
and yo gran, 100% of people over 75 
years of age. In the age group from 20 to 60 
years the binaural test was found to be nega- 
tive in 97 to 98.5% of the cases. In the 
pathologic group the tumor cases showed 
the highest rate of positive test results, 84% 
of a total of 38 cases. Lower percentages of 
positive test results were found in other 
groups, such as those with atrophic and vas- 
cular diseases, multiple sclerosis, epilepsy and 
trauma. In many of these cases the hearing 
determined by the classical methods was 
normal. The author emphasizes the impor- 
tance of the BT in these cases where the 
classical methods do not show any abnormality
at all. On the other hand, in cases
where abnormalities were found, it is not 
possible to determine reliably and specifically, 
with the classical audiologic methods, if the 
abnormality or part of it is caused centrally. 
Here the BT should prove its value for 
differential diagnosis. Theoretical arguments 
are included in the monograph. 


RUEDIGER THALMANN
Central Institute for the Deaf 


Harris, Cyril M. (Editor), Handbook of
Noise Control. New York: McGraw-Hill 
Book Company, Inc., 1957. Pp. 1053, 763 illus- 
trations. $16.50. 


This volume is an important addition to a 
series issued by one of the leaders in the pub- 
lication of technical handbooks. Dr. Harris 
is a distinguished figure in the field of 
acoustics, and his high standards and good 
judgment in the selection and editing of the 
material for the Handbook of Noise Control 
represent a major contribution to the field. 

The content of the book is very well pre- 
sented. It is amply illustrated with well pre- 
pared figures and is well organized with 
appropriate subheadings. Some who spend 
many hours with the book may occasionally 
find the relatively small print annoying, but 
it also makes possible the presentation of a 
tremendous amount of material within a book 
of reasonable size. 


There is a total of 40 sections to the hand- 
book, prepared by approximately as many 
authors. The author list is essentially a ‘Who’s
Who’ in the field of noise analysis and control
with only a few distinguished names missing. 
In general, those interested in speech and 
hearing will find the first half of the book 
more valuable than the second half. Section 
headings that should be particularly interest- 
ing to those who work in the field of speech 
and hearing are: 

2—Physical Properties of Noise and their 
Specification, 4—The Hearing Mechanism,
5—The Loudness of Sounds, 6—Audio- 
metric Testing in Industry, 7—Hearing 
Loss Resulting from Noise Exposure, 
8—Ear Protectors, 9—Effects of Noise on 
Speech, 10—Effects of Noise on Behavior,
16—Instruments for Noise Measurements,
17—Noise Measuring Techniques, 35— 
Community Noise and City Planning, 
36—Community Reaction to Noise, 38— 
Legal Liability for Loss of Hearing. 

The book is of considerable importance to 
those who work in the field of hearing and 
is an essential reference for anyone interested 
in the acoustical aspects of communication. 


It is certainly comprehensive and is the major 
reference for questions on noise. There is a 
certain clarity of expression and specificity of 
statement throughout the handbook which 
show the effect of careful editing by Dr. 
Harris. The selected set of references at the 
close of each section provides an excellent 
aid to further study of specific problems. 
Gordon E. Peterson

University of Michigan 





Letter To The Editor 


The Editorial Staff assumes no responsibility for 
the opinions expressed in letters. 





Liaison Between ASHA 
and Army? 


The article, “Survey of Hearing Losses 
Among Armor Personnel,” that appeared in 
the December issue of the Journal of Speech 
and Hearing Research was very interesting. 
Unfortunately if it gave anyone the impres- 
sion that the army is adequately oriented to 
problems of speech and hearing, it was 
misleading. 

While inquiring into the possibility of a 
speech therapist or speech pathologist trans- 
ferring an army reserve commission to the 
Medical Services Corps or to the Medical 
Specialists Corps, I collected some information 
that may be of interest to many members 
of the American Speech and Hearing Associ- 
ation. Army Regulation 140-101 states that 
the following persons holding commissions 
may apply for transfer to the corps men- 
tioned: nutritionists, optometrists, chiropo- 
dists, psychologists, social workers, physical 
therapists, occupational therapists, and various 
other professionally trained persons. A doc- 
torate is required of the psychologists, a 
master’s degree of the social workers, and
no degree is necessary for the physical and 
occupational therapists. No mention is made 
in that regulation of speech therapists, speech 
pathologists, or audiologists. 

In reply to a letter directed to the Surgeon 
General, a major from the Personnel and 
Training Division of the Medical Services 
Corps advised me that the army has a relative- 
ly small requirement for speech therapists and 
speech pathologists. Therefore specialists in 
those fields are not generally commissioned 
but are obtained as civilian or enlisted per- 
sonnel. He stated that seldom could an 


officer be utilized in these specialties even 
though he be fully qualified. 

It seems to me this poses two serious prob- 
lems to our profession. 1. Even if the army 
were using only one soldier in his profes- 
sional capacity as a qualified speech therapist 
or speech pathologist, it would be an injus- 
tice to have that individual working as an 
enlisted man alongside commissioned social 
workers, occupational therapists, etc. 2. It 
appears that the army is unaware of the 
speech rehabilitation that has been provided 
disabled soldiers in the past. In the literature 
there are many references to speech prob- 
lems resulting from war injuries. Wepman, in 
Recovery from Aphasia, mentions that
during World War II great numbers of 
persons received injuries that caused aphasia. 
He states that a small percentage of these 
injured service men received the benefits 
available in army training centers established 
for the resolution of this type of problem. 
He reports that many other soldiers with 
aphasia were discharged without speech re- 
habilitation. Some of these persons later 
received help from the Veterans Administra- 
tion. If a soldier must await discharge from 
the army and admittance to a Veterans Ad- 
ministration program prior to treatment, he 
will not receive the early care advocated by 
modern writers on aphasia. 

Aphasia is not the only problem involving 
speech that may result from war experiences 
or injuries. Van Riper and Irwin in Voice 
and Articulation indicate that hysterical 
aphonia may be a problem among combat 
troops. A physician, William G. Peacher, 
has written a series of articles on speech 
disorders in World War II. Among the prob- 
lems he discusses are dysarthria, speech de- 
fects resulting from maxillofacial injuries, 
and speech defects resulting from neurologi- 
cal and structural injuries to the tongue. 
(See Plastic and Reconstructive Surgery Vol. 
5, 1950, p. 123). Peacher writes that a speech 
clinic was organized for the United States 
Army in 1943 at Brooke General Hospital. 
According to H. Loebell in an article re- 
printed in Speech Therapy, A Book of Read- 
ings edited by Van Riper, the German Army 
created a special division for speech and 
voice disorders in 1939. The value of army 
aural rehabilitation centers during World 
War II has been recognized and described. 

Moves to correct the situation described 
above would involve commissioning any 
graduate speech therapists or speech pathologists 
certified with the American Speech and 
Hearing Association who are serving in a 
professional capacity as enlisted personnel. 
Anyone serving in these capacities without 
adequate education and certification should 
be transferred to other duties. 


It is also necessary that a program be 
planned to utilize speech pathology in military 
situations during war time. The literature 
clearly indicates that the military does 
then need a considerable number of speech 
therapists and speech pathologists. In these 
days of planning civil and military reserve 
defense programs, provision should be made 
for the speech rehabilitation of the injured. 
Undoubtedly speech therapists and speech 
pathologists should be stationed in combat 
zones in order to provide the early training 
that contributes to the more rapid and more 
complete recovery of the patient. 

Liaison between the American Speech and 
Hearing Association and the military is 
necessary if such changes are to be effected. 


Ralph L. Shelton, Jr., 

Research Fellow, Department of Pediatrics 
University of Utah, Salt Lake City, Utah 
December 17, 1958 